WELCOME TO THE FUTURE
I discovered yesterday that the campus Google appliance had gone bananas on our FAQ service, hitting it up for three specific search terms every five minutes or so.
Since May. Since 11:08 AM on May 13th, to be precise.
I also discovered that I'd run searches on those three terms right when the Google bot started its journey into darkness, which presumably triggered it. But I'd run searches on a number of other terms at the same time, because I was updating all the FAQs associated with them, so I have no idea why the bot chose to obsess over "copyright," "texshare," and "new material alerts." The database that holds the logs for these queries is now overflowing, at approximately 30K hits per month since May, rendering the stats tools in the service unusable.
I talked with campus IT. Why it picked those three terms is still unknown, but the appliance was having problems with the searches not completing, so it kept repeating them. After discussion, I chose to stop the madness by having the campus search no longer index the FAQs. Most of our users use regular Google to search for stuff related to the campus and the library tbh, and the site has a custom Google search rather than the campus search, so it'll affect us not at all. I've also got a support ticket in with the service asking them to delete all the records that come from that specific IP address.
So, um, yay technology?