Digitizing a company's history is a crucial step in extracting value from its past. Today, thanks to advancements in digitization technologies and artificial intelligence algorithms, this process can be done quickly and effectively.
DIGITIZE THE COMPANY ARCHIVES
When we think of an archive, we often picture a dusty place that holds little value or accessibility for outsiders. However, in many cases, these archives contain valuable assets in the form of information that is difficult to access.
The inability to access these hidden contents contradicts our expectations in a world where smartphones have become an integral part of our daily lives, granting us instant access to a network of apps for work, leisure, and personal time. Undoubtedly, our lifestyles have changed significantly. Social media, for instance, has become an essential tool for corporate communication, and the Internet of Things (IoT) allows us to connect with everything and everyone.
It has long been evident that this social and technological transformation prompts us to consider the opportunities brought about by innovation. Online newspapers, e-commerce portals, training, and corporate marketing strategies now heavily rely on the internet as their main communication channels.
Thus, there is a need to create a sort of "big data" of the past, starting with the company's documents, which would enrich the realm of internet content.
This need is confirmed by the explosion of internet content, which presents numerous opportunities but also obscures how little historical information is available. The web is inundated with recent content and news from the past 25-30 years, while from earlier periods only well-known and significant events have managed to emerge.
This trend is illustrated in Figure 1, showing the increase in internet content over time. It is evident that the volume of information related to the past grew linearly with a slow rate of increase until the 2000s. However, from that point onward, the curve became exponential. This can be attributed to the availability of native digital data and the ease of accessing it through tools like smartphones and the cloud.
Therefore, the question arises: How can we ensure that the past is adequately represented? Figure 2, called the "Information Mushroom", simulates the effect of digitizing the "hidden" contents within archives. On the left side, the graph from Figure 1 is reversed and mirrored, resembling a mushroom with a narrow stem and a large head. On the right side, a new goblet shape is proposed, representing the wider base resulting from the digitization of past documents.
TECHNOLOGIES
In this context, bridging the "productive" gap requires addressing the technological and organizational deficiencies in the traditional processes of digitizing historical archives.
We believe that reorganizing processes and adopting innovative technologies can significantly transform digitization, specifically with regard to company archives.
The longstanding obstacle to mass digitization has been low process efficiency, and improving it is the primary challenge. Traditionally, organizational models and technologies have been non-standardized and highly artisanal: outcomes were generally of high quality, but productivity was limited and technological innovation received insufficient attention.
The key turning point lies in technological innovation within these processes, particularly in harnessing "new" enabling technologies.
Given present-day advances in AI tools and algorithms, this technology needs to be integrated into areas such as the digitization of business archives.
Building upon the aforementioned considerations, it becomes crucial to employ advanced, up-to-date, and standardized technologies. A comprehensive management system for digitization processes should encompass hardware and software components, seamlessly integrated to meet the requirements of a digitization center. These components can be summarized as follows:
- organizational, project, and activity management
- applications for scanners
- image processing for refining the digitized images
- extraction of text content from images (OCR for printed text, HTR for handwriting)
- creation and management of metadata
- information system for preservation, database, and data storage
- content publication and communication
- collaborative data interoperability (Linked Open Data, LOD)
This amalgamation forms a kind of "black box" with its core being the software that powers various digitization processes. These processes span from project management and image production to data storage and publication.
A suite of purpose-built applications, tailored to different types of objects (e.g., books, photos, paintings), is interconnected to carry items through the digitization workflow systematically.
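As a rough illustration of how such interconnected applications might be chained, the Python sketch below models the workflow as an ordered list of stages selected by object type. It is only a sketch under stated assumptions: the stage names mirror the component list above, but the `ArchiveItem` structure, the stage order, and the rule that text extraction is skipped for photos and paintings are hypothetical and not taken from any actual product.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ArchiveItem:
    """One object moving through the workflow (hypothetical structure)."""
    item_id: str
    object_type: str  # e.g. "book", "photo", "painting"
    history: List[str] = field(default_factory=list)


Stage = Callable[[ArchiveItem], ArchiveItem]

# Stage names follow the component list in the text; the order is an assumption.
STAGE_NAMES = [
    "project_management",
    "scanning",
    "image_processing",
    "text_extraction",   # OCR / HTR
    "metadata",
    "preservation",
    "publication",
    "interoperability",  # LOD
]

# Assumption: objects without text skip the OCR/HTR stage.
NON_TEXTUAL_TYPES = {"photo", "painting"}


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(item: ArchiveItem) -> ArchiveItem:
        item.history.append(name)
        return item
    return stage


def build_workflow(object_type: str) -> List[Stage]:
    """Select the stages that apply to a given object type."""
    names = [
        name for name in STAGE_NAMES
        if not (name == "text_extraction" and object_type in NON_TEXTUAL_TYPES)
    ]
    return [make_stage(name) for name in names]


def run_workflow(item: ArchiveItem) -> ArchiveItem:
    """Pass the item through every stage of its workflow, in order."""
    for stage in build_workflow(item.object_type):
        item = stage(item)
    return item


if __name__ == "__main__":
    book = run_workflow(ArchiveItem("doc-001", "book"))
    print(book.history)
```

In a real system each placeholder stage would call the corresponding application (scanner software, image processing, OCR/HTR engine, and so on); the point here is only that a shared item record and a per-type stage list are enough to make the workflow systematic.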