We have long since entered an age of unbridled data growth. The almost limitless recording of every interaction, every activity and every process is generating more and more data.
IDC (International Data Corporation) predicts that a data volume of 175 zettabytes will be generated worldwide by 2025 [1].
| 1 zettabyte (1ZB) = 10^21 bytes
| On stacked hard disks, 175 zettabytes would be 2.31 times the size of the moon
The data is not limited to big-data-relevant IoT telemetry or measurement data from production lines: additional parallel sources of information arise from the growing number of internal communication channels and from attempts to combine and pool resources. A lack of streamlined processes for internal resources, structures and knowledge management tools is leading to a constant increase in the amount of data in companies.
Employees spend hours searching for information
Employees spend a substantial part of their working week, around 8.8 hours, searching for information in this data, be it the port of a particular service or the boss's internal extension number [9]. Not all systems offer a comprehensive search "like Google"; some offer none at all. Where search engines exist, they differ greatly in function and scope: tools such as Jira offer a query language, while chats sometimes do not even allow a cross-channel search. If no search engine is available, employees have to fight their way through folder structures and page hierarchies and need at least a rough idea of where to find what they are looking for.
The promise of enterprise search
The field of enterprise search addresses this problem and attempts to provide a single, all-encompassing search within the company that can satisfy the information needs of all employees. Commercial providers such as Splunk and Amazon offer cloud-based solutions, but open source software such as Elasticsearch and Apache Solr can also be used for enterprise search.
The promise is simple: using enterprise search should significantly reduce the time employees spend searching for data. To this end, all data sources are made searchable with a single search.
The providers promise a high return on investment:
Google uses "conservative figures" [4, p. 4] to calculate an RoI of over 22 million euros for medium-sized companies [4]; John Lenker from LucidWorks calculates a 15- to 20-fold RoI per year [5].
It's more complex than on the web
However, it is questionable whether these figures are correct. Basically, these are rough estimates; in most cases, potentially high training and support costs are not even taken into account.
The results of the few scientific studies on the profitability and effectiveness of enterprise search are sobering. Company-wide search options quickly reach their limits, and always along the same pattern: the larger a company and the more diverse its islands of knowledge, the more inefficient a search across all information can become. Other soft factors are the structures and hierarchies in a company, the existence of a basic understanding of search, and the quality and preparation of the underlying data [6].
Searching an intranet or internal databases is much more complex than searching the web:
The relevant data comes from various sources in a wide variety of formats; it may be modular or incomplete and is only aggregated during a search. In addition, individual and fine-grained authorization models must be taken into account. Data protection requirements further increase the effort involved.
The searcher not only expects various structured data to be made searchable, but also expects to find the right result, not the best possible result [7,8].
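The combination of heterogeneous sources and fine-grained authorization can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Document` class, the `allowed_groups` field, the `search` function and the sample documents are hypothetical, and the term-count scoring is a deliberately naive stand-in for a real ranking function. The only point is that permissions have to be checked per document before anything is ranked or shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A searchable item from any internal source (hypothetical model)."""
    source: str                # e.g. "wiki", "chat", "hr"
    title: str
    text: str
    allowed_groups: frozenset  # groups that may see this document


def search(documents, query, user_groups):
    """Return the documents visible to the user, ranked by naive term count."""
    terms = query.lower().split()
    results = []
    for doc in documents:
        # Authorization first: skip documents the user may not see.
        if not (doc.allowed_groups & user_groups):
            continue
        haystack = f"{doc.title} {doc.text}".lower()
        # Naive score: number of substring occurrences of each query term.
        score = sum(haystack.count(term) for term in terms)
        if score > 0:
            results.append((score, doc))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in results]


# Hypothetical sample data drawn from three different "sources".
DOCS = [
    Document("wiki", "Service ports",
             "The billing service listens on port 8443.",
             frozenset({"engineering"})),
    Document("chat", "ops channel",
             "billing port changed last week",
             frozenset({"engineering", "ops"})),
    Document("hr", "Salary bands",
             "billing department salary bands",
             frozenset({"hr"})),
]

engineering_titles = [d.title for d in search(DOCS, "billing port", {"engineering"})]
ops_titles = [d.title for d in search(DOCS, "billing port", {"ops"})]
```

The same query returns different result lists for different users, which is one reason why relevance in the enterprise cannot be tuned as uniformly as on the public web.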
It also depends on the employees
Further problems arise from the search habits and search behavior of employees.
Three groups of intranet searchers can be distinguished, whose requirements are sometimes incompatible (e.g. recall vs. precision [10]) [11]: inexperienced users and occasional searchers (approx. 80%), interactive users and intensive users (approx. 14%) and "information-search-savvy employees". To reduce the cognitive load, some users have developed different strategies. For example, they sometimes navigate through folder and page hierarchies in order to avoid formulating a complex query that would have allowed them to jump directly to the result.
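The tension between recall and precision mentioned above can be made concrete with a small sketch. Precision is the share of returned results that are relevant; recall is the share of all relevant documents that were returned. The function name `precision_recall` and the document IDs are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for the result list of a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth: four documents actually answer the query.
relevant = {"doc1", "doc2", "doc3", "doc4"}

# A narrow result list: everything shown is relevant, but half is missing.
narrow = ["doc1", "doc2"]

# A broad result list: nothing is missing, but half of it is noise.
broad = ["doc1", "doc2", "doc3", "doc4", "doc5", "doc6", "doc7", "doc8"]

narrow_scores = precision_recall(narrow, relevant)  # (1.0, 0.5)
broad_scores = precision_recall(broad, relevant)    # (0.5, 1.0)
```

An occasional searcher who wants one quick answer is typically better served by the narrow list, whereas someone who must not miss anything prefers the broad one. A single search configuration can hardly optimize for both.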
The current hype surrounding enterprise search should be treated with caution. When calculating the RoI, not only potential costs, but also the suitability of the company's internal structures and employee skills must be taken into account.
Sources:
[1] https://www.seagate.com/files/www-content/our-story/trends/files/idc-seagate-dataage-whitepaper.pdf
[2] https://sci-hub.se/10.1145/2637748.2638425
[3] https://sci-hub.se/10.1145/985692.985745
[4] https://static.googleusercontent.com/media/194.78.99.204/en/204/enterprise/search/files/Internal_Search_ROI.pdf
[5] https://de.slideshare.net/lucidworks/measuring-roi-on-enterprise-search-john-lenker-lucidworks
[6] https://edoc.hu-berlin.de/bitstream/handle/18452/17001/bertram.pdf?sequence=1&isAllowed=y
[7] Mani Abrol, Neil Latarche, Uma Mahadevan, Jianchang Mao, Rajat Mukherjee, Prabhakar Raghavan, Michel Tourn, John Wang, and Grace Zhang. Navigating large-scale semi-structured data in business portals. Very Large Data Bases, pages 663-666, 2001.
[8] Rajat Mukherjee and Jianchang Mao. Enterprise Search: Tough Stuff. ACM Queue, 2(2):36-46, 2004.
[9] McKinsey Global Institute. The social economy: Unlocking value and productivity through social technologies. 2012.
[10] https://de.wikipedia.org/wiki/Beurteilung_eines_bin%C3%A4ren_Klassifikators#Anwendung_im_Information_Retrieva
[11] Dick Stenmark. Identifying clusters of user behavior in intranet search engine log files. Journal of the American Society for Information Science and Technology, 59(14):2232-2243, dec 2008. doi:10.1002/asi.20931.