How to handle Session Management - Bots and Spiders



JonC
February 2nd, 2012, 01:49 PM
First and foremost, we suggest creating a robots.txt file in the web root of the domain to address two issues: first, to control the rate at which the website is crawled, which helps prevent a bot/spider from opening a massive number of database connections at the same time; second, to block specific bots from crawling the website at all. We suggest the following defaults. You may want to add or remove the user agents denied and adjust the crawl rate, but we suggest setting Crawl-delay no lower than 3 seconds (note that not every crawler honors Crawl-delay; Googlebot, for example, ignores it).


User-agent: *
Crawl-delay: 10

User-agent: Baiduspider
Disallow: /

User-agent: Sosospider
Disallow: /

Next we suggest setting your session timeout specifically lower for bots and spiders. Because bots and spiders generally do not return cookies, every page they crawl creates a new ColdFusion session, and each of those sessions persists in memory for its full timeout. A very short timeout still spans the page load, so the bot or spider gets the information from the webpage AND the session expires quickly afterwards, protecting ColdFusion from effects similar to a memory leak.

Session Management code examples for Application.cfm (http://www.bennadel.com/blog/1083-ColdFusion-Session-Management-And-Spiders-Bots.htm)
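
For reference, here is a minimal sketch of the same idea for a legacy Application.cfm. The application name and timeout values are assumptions, chosen to match the Application.cfc example below:


<!--- Minimal Application.cfm sketch (assumed values): clients that do not
      return the CFID cookie get a 2-second session; everyone else 60 minutes. --->
<cfif StructKeyExists(cookie, "cfid")>
    <cfset sessionTimeout = CreateTimeSpan(0, 0, 60, 0)>
<cfelse>
    <cfset sessionTimeout = CreateTimeSpan(0, 0, 0, 2)>
</cfif>

<cfapplication
    name="MyApp"
    sessionmanagement="Yes"
    sessiontimeout="#sessionTimeout#">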

Application.cfc code instructions below; adjust the timeout to your application's requirements.

Within the cfscript block of the component's pseudo-constructor, replace:

THIS.sessionTimeout = createTimeSpan(0,0,60,0);

With:


// If the CFID cookie exists, the client returned a cookie from a prior
// request, so treat it as a real user. Bots generally do not return
// cookies, so this returns false for them and the short timeout applies.
if (structKeyExists(cookie, "cfid")) {
    THIS.sessionTimeout = createTimeSpan(0, 0, 60, 0); // 60 minutes for real users
} else {
    THIS.sessionTimeout = createTimeSpan(0, 0, 0, 2); // 2 seconds for cookieless requests
}
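
One caveat: a real visitor's very first request carries no CFID cookie either, so it also lands in the short-timeout branch; once ColdFusion sets the CFID/CFTOKEN cookies, subsequent requests fall into the 60-minute branch. If you prefer to key off the User-Agent header instead, a hypothetical sketch might look like the following (the pattern is illustrative only, not an exhaustive bot list):


// Hypothetical alternative: flag suspected bots by User-Agent.
// The regular expression below is an assumption, not a complete list.
if (reFindNoCase("(bot|spider|crawl|slurp)", CGI.HTTP_USER_AGENT) GT 0) {
    THIS.sessionTimeout = createTimeSpan(0, 0, 0, 2); // suspected bot
} else {
    THIS.sessionTimeout = createTimeSpan(0, 0, 60, 0); // normal visitor
}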