Class CrawlerSessionManagerValve

  • All Implemented Interfaces:
    javax.management.MBeanRegistration, Contained, JmxEnabled, Lifecycle, Valve

    public class CrawlerSessionManagerValve
    extends ValveBase
    Web crawlers can trigger the creation of many thousands of sessions as they crawl a site, which may result in significant memory consumption. This Valve ensures that each crawler is associated with a single session, just like a normal user, regardless of whether or not it provides a session token with its requests.
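The idea of tying each crawler to one session can be sketched as a map from client IP to session ID. This is a simplified illustration only, not the Valve's implementation: the class name `CrawlerSessionReuseSketch`, the method `sessionIdFor`, and the `session-N` ID format are hypothetical, and the sketch ignores session expiry as well as the host and context awareness options.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: reuse one session ID per crawler client IP.
final class CrawlerSessionReuseSketch {

    // Maps a crawler's client IP to the single session ID it should reuse.
    private final Map<String, String> clientIpSessionId = new ConcurrentHashMap<>();

    // Counts how many sessions were actually created (assumption: IDs are sequential).
    private final AtomicInteger created = new AtomicInteger();

    // Returns the session ID already recorded for this IP, creating one on first sight.
    String sessionIdFor(String clientIp) {
        return clientIpSessionId.computeIfAbsent(
                clientIp, ip -> "session-" + created.incrementAndGet());
    }

    int sessionsCreated() {
        return created.get();
    }

    public static void main(String[] args) {
        CrawlerSessionReuseSketch sketch = new CrawlerSessionReuseSketch();
        // Three requests without a session token, two from the same crawler IP.
        System.out.println(sketch.sessionIdFor("203.0.113.7"));
        System.out.println(sketch.sessionIdFor("203.0.113.7"));
        System.out.println(sketch.sessionIdFor("198.51.100.9"));
        System.out.println("sessions created: " + sketch.sessionsCreated());
    }
}
```

Without such a lookup, every tokenless request in this sketch would create a fresh session; with it, repeated requests from the same address share one.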
    • Constructor Detail

      • CrawlerSessionManagerValve

        public CrawlerSessionManagerValve()
        Specifies a default constructor so async support can be configured.
    • Method Detail

      • setCrawlerUserAgents

        public void setCrawlerUserAgents(java.lang.String crawlerUserAgents)
        Specify the regular expression (using Pattern) that will be used to identify crawlers based on the User-Agent header provided. The default is ".*GoogleBot.*|.*bingbot.*|.*Yahoo! Slurp.*".
        Parameters:
        crawlerUserAgents - The regular expression using Pattern
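To illustrate how such an expression behaves, the following sketch applies the default quoted above to a few sample User-Agent strings using `java.util.regex.Pattern`. It assumes the whole header value must match (which would explain the leading and trailing `.*` in the default); the class `CrawlerUserAgentSketch` and its method `isCrawler` are hypothetical and are not part of the Valve.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of User-Agent matching; not the Valve's own code.
final class CrawlerUserAgentSketch {

    // Default expression as quoted in the method description.
    static final String DEFAULT_CRAWLER_USER_AGENTS =
            ".*GoogleBot.*|.*bingbot.*|.*Yahoo! Slurp.*";

    private final Pattern pattern;

    CrawlerUserAgentSketch(String regex) {
        // A null or empty expression is treated as "match nothing" (assumption).
        this.pattern = (regex == null || regex.isEmpty()) ? null : Pattern.compile(regex);
    }

    // True when the full User-Agent value matches the expression.
    boolean isCrawler(String userAgent) {
        return pattern != null && userAgent != null
                && pattern.matcher(userAgent).matches();
    }

    public static void main(String[] args) {
        CrawlerUserAgentSketch sketch =
                new CrawlerUserAgentSketch(DEFAULT_CRAWLER_USER_AGENTS);
        String[] samples = {
            "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
            "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/115.0"
        };
        for (String ua : samples) {
            System.out.println(sketch.isCrawler(ua) + "  " + ua);
        }
    }
}
```

Pattern matching is case-sensitive unless a flag says otherwise, so in this sketch a header containing `Googlebot` (lowercase b) does not match `.*GoogleBot.*`. An expression such as `.*[bB]ot.*` would be one way to cover both spellings.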
      • getCrawlerUserAgents

        public java.lang.String getCrawlerUserAgents()
        Returns:
        The current regular expression being used to match user agents.
        See Also:
        setCrawlerUserAgents(String)
      • setCrawlerIps

        public void setCrawlerIps(java.lang.String crawlerIps)
        Specify the regular expression (using Pattern) that will be used to identify crawlers based on their IP address. The default is no crawler IPs.
        Parameters:
        crawlerIps - The regular expression using Pattern
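A minimal sketch of IP-based matching follows. The addresses and the expression are made-up examples, the class `CrawlerIpSketch` is hypothetical, and full-string matching is an assumption. The dots are escaped so that they match a literal `.` rather than any character.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of IP address matching; not the Valve's own code.
final class CrawlerIpSketch {

    // null models the documented default of "no crawler IPs".
    private final Pattern ipPattern;

    CrawlerIpSketch(String crawlerIps) {
        this.ipPattern = (crawlerIps == null || crawlerIps.isEmpty())
                ? null : Pattern.compile(crawlerIps);
    }

    // True when the whole remote address matches the configured expression.
    boolean isCrawlerIp(String remoteAddr) {
        return ipPattern != null && remoteAddr != null
                && ipPattern.matcher(remoteAddr).matches();
    }

    public static void main(String[] args) {
        // Example expression: one private /24 range plus one single host.
        CrawlerIpSketch sketch =
                new CrawlerIpSketch("192\\.168\\.1\\.\\d{1,3}|10\\.0\\.0\\.5");
        for (String ip : new String[] {"192.168.1.42", "192.168.2.42", "10.0.0.5", "10.0.0.50"}) {
            System.out.println(ip + " -> " + sketch.isCrawlerIp(ip));
        }
    }
}
```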
      • getCrawlerIps

        public java.lang.String getCrawlerIps()
        Returns:
        The current regular expression being used to match IP addresses.
        See Also:
        setCrawlerIps(String)
      • setSessionInactiveInterval

        public void setSessionInactiveInterval(int sessionInactiveInterval)
        Specify the session timeout (in seconds) for a crawler's session. This is typically lower than that for a user session. The default is 60 seconds.
        Parameters:
        sessionInactiveInterval - The new timeout for crawler sessions
      • getClientIpSessionId

        public java.util.Map<java.lang.String,java.lang.String> getClientIpSessionId()
      • isHostAware

        public boolean isHostAware()
      • setHostAware

        public void setHostAware(boolean isHostAware)
      • isContextAware

        public boolean isContextAware()
      • setContextAware

        public void setContextAware(boolean isContextAware)
      • initInternal

        protected void initInternal()
                             throws LifecycleException
        Description copied from class: LifecycleMBeanBase
        Sub-classes wishing to perform additional initialization should override this method, ensuring that super.initInternal() is the first call in the overriding method.
        Overrides:
        initInternal in class ValveBase
        Throws:
        LifecycleException - If the initialisation fails
      • invoke

        public void invoke(Request request,
                           Response response)
                    throws java.io.IOException,
                           ServletException
        Description copied from interface: Valve

        Perform request processing as required by this Valve.

        An individual Valve MAY perform the following actions, in the specified order:

        • Examine and/or modify the properties of the specified Request and Response.
        • Examine the properties of the specified Request, completely generate the corresponding Response, and return control to the caller.
        • Examine the properties of the specified Request and Response, wrap either or both of these objects to supplement their functionality, and pass them on.
        • If the corresponding Response was not generated (and control was not returned), call the next Valve in the pipeline (if there is one) by executing getNext().invoke().
        • Examine, but not modify, the properties of the resulting Response (which was created by a subsequently invoked Valve or Container).

        A Valve MUST NOT do any of the following things:

        • Change request properties that have already been used to direct the flow of processing control for this request (for instance, trying to change the virtual host to which a Request should be sent from a pipeline attached to a Host or Context in the standard implementation).
        • Create a completed Response AND pass this Request and Response on to the next Valve in the pipeline.
        • Consume bytes from the input stream associated with the Request, unless it is completely generating the response, or wrapping the request before passing it on.
        • Modify the HTTP headers included with the Response after the getNext().invoke() method has returned.
        • Perform any actions on the output stream associated with the specified Response after the getNext().invoke() method has returned.
        Parameters:
        request - The servlet request to be processed
        response - The servlet response to be created
        Throws:
        java.io.IOException - if an input/output error occurs, or is thrown by a subsequently invoked Valve, Filter, or Servlet
        ServletException - if a servlet error occurs, or is thrown by a subsequently invoked Valve, Filter, or Servlet
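The ordering rules in the contract above can be sketched with a toy pipeline. Everything here is hypothetical and greatly simplified: `MiniValve` stands in for a valve, a `Map` stands in for the request, and a `StringBuilder` stands in for the response. None of these are the Catalina types.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy pipeline illustrating the valve contract; not Catalina code.
final class ValveContractSketch {

    // Simplified valve with a link to the next valve in the pipeline.
    abstract static class MiniValve {
        private MiniValve next;

        MiniValve getNext() {
            return next;
        }

        void setNext(MiniValve next) {
            this.next = next;
        }

        abstract void invoke(Map<String, String> request, StringBuilder response);
    }

    // Modifies the request, delegates, then only reads the finished response.
    static final class TaggingValve extends MiniValve {
        final List<String> log = new ArrayList<>();

        @Override
        void invoke(Map<String, String> request, StringBuilder response) {
            request.put("tagged", "true"); // allowed: change the request before passing it on
            if (getNext() != null) {
                getNext().invoke(request, response);
            }
            // Allowed after the call returns: examine, but do not modify, the response.
            log.add("response length " + response.length());
        }
    }

    // Terminal valve: completely generates the response and does not delegate.
    static final class EchoValve extends MiniValve {
        @Override
        void invoke(Map<String, String> request, StringBuilder response) {
            response.append("tagged=").append(request.get("tagged"));
        }
    }

    // Runs a two-valve pipeline and returns the response plus the first valve's log entry.
    static String run() {
        TaggingValve first = new TaggingValve();
        first.setNext(new EchoValve());
        StringBuilder response = new StringBuilder();
        first.invoke(new HashMap<>(), response);
        return response + " | " + first.log.get(0);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "tagged=true | response length 11"
    }
}
```

In this sketch the first valve never writes to the response after delegating, which mirrors the rule against touching headers or the output stream once the next valve's invoke call has returned.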