The application enables organizations to make optimum use of the capabilities of the SharePoint platform and improve user productivity and adoption. Using metadata in SharePoint columns allows for filtering, sorting, grouping and powerful searches. It also helps organizations meet compliance criteria (e.g., destroy content after 7 years).
Select documents using CAML
Extract properties from documents
Upload metadata to SharePoint
BulkMetadata extracts properties present within SharePoint documents (e.g., sent date, subject, … from emails, keywords from PDF files, last modified date from Office files), and uploads the extracted values into SharePoint columns. The application uses Node.js and runs on a separate computer system.
The application enables organizations to make optimum use of the SharePoint platform and improve productivity and user adoption. Using metadata in SharePoint columns allows for filtering, sorting, grouping and powerful searches. By extracting the created / modified dates from the actual documents (docx, pdf, msg, …) rather than using the import date, organizations can ensure data is retained or destroyed in accordance with legal requirements.
Basically, the application:
– downloads documents based on a CAML query
– extracts the document properties
– uploads the metadata to SharePoint columns.
Common use cases:
– add metadata to emails stored in SharePoint
– extract order details from PDF files residing in SharePoint
– extract last modified dates from Office files in SharePoint (and not the date when the document was uploaded)
The following figures show the document details before and after running the application.
Supported file types and properties
The following properties can be extracted from documents stored in SharePoint:
Recover original create and modify dates
BulkMetadataManager can use the original create and modify dates stored within files as the SharePoint Created and Modified dates.
After extraction the original create and modify dates are used as Created and Modified dates
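To illustrate where such dates live, here is a minimal Python sketch that reads the create and modify dates from an Office Open XML file (docx, xlsx, pptx). These files are zip packages whose docProps/core.xml part holds the core properties. This is not the application's own code (which uses Node.js); the function name read_ooxml_dates is an assumption made for this example.

```python
import zipfile
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespaces used by docProps/core.xml in Office Open XML packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_ooxml_dates(source):
    """Return (created, modified) from an Office Open XML file.

    `source` is a path or a binary file-like object.
    A value that is absent from the file is returned as None.
    (Hypothetical helper, not part of BulkMetadataManager.)
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))

    def parse(tag):
        element = root.find(tag, NS)
        if element is None or not element.text:
            return None
        # Values are W3CDTF timestamps such as 2019-03-04T10:15:00Z.
        return datetime.fromisoformat(element.text.replace("Z", "+00:00"))

    return parse("dcterms:created"), parse("dcterms:modified")
```

Calling read_ooxml_dates("report.docx") would return two timezone-aware datetime values, or None for a date the file does not record. Other formats (pdf, msg) store their dates differently and are not covered by this sketch.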
The application generates multiple log files:
– a summary log file with high-level details
– a detailed log file with details for the individual documents
– a whatif log file with the details of the extracted properties
The application is optimized for performance by using parallel processes for downloading documents and setting metadata, caching, and other optimization techniques. Under optimal conditions the application can process 10 documents per second. The application is built to handle very large numbers of documents (hundreds of thousands and more). In addition, the application architecture allows for scaling out by using separate computer systems.
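As a rough planning aid, the throughput figure can be turned into a run-time estimate. The sketch below assumes the rate of 10 documents per second holds for the entire run, which real runs may not achieve; the function name estimated_hours is an assumption made for this example.

```python
def estimated_hours(document_count, docs_per_second=10.0):
    """Estimate processing time in hours at a constant throughput.

    Assumes the rate stays constant for the whole run (an idealization).
    """
    if docs_per_second <= 0:
        raise ValueError("docs_per_second must be positive")
    return document_count / docs_per_second / 3600


for count in (100_000, 1_000_000):
    print(f"{count:>9,} documents: about {estimated_hours(count):.1f} hours")
```

At the stated rate, 100,000 documents take about 2.8 hours and 1,000,000 documents about 27.8 hours, which is where scaling out to separate computer systems becomes relevant.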
BulkMetadataManager can handle throttling by SharePoint Online and will automatically resume after the retry-after period has expired. The application also supports large lists exceeding the list view threshold (5000 items).
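The general pattern for honoring a retry-after period can be sketched as follows. SharePoint Online signals throttling with HTTP status 429 or 503 and a Retry-After header. This is an illustrative Python sketch, not the application's implementation; call_with_retry and the send callable are hypothetical names introduced for this example.

```python
import time


def call_with_retry(send, max_attempts=5, default_wait=5.0, sleep=time.sleep):
    """Call `send()` until the response is not throttled.

    `send` is a hypothetical callable returning (status_code, headers, body).
    Returns (status_code, body) of the first non-throttled response.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status not in (429, 503):
            return status, body
        if attempt == max_attempts:
            break
        # Retry-After is read as a number of seconds; fall back to the
        # default wait when the header is missing or not numeric.
        try:
            wait = float(headers.get("Retry-After", default_wait))
        except ValueError:
            wait = default_wait
        sleep(wait)
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```

Passing the sleep function as a parameter keeps the sketch easy to test without real delays.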
The WhatIf option allows for assessing the application and the configuration without uploading the metadata to SharePoint. The extracted document properties are saved to a separate csv file for further analysis.
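The idea behind a what-if run can be sketched in a few lines of Python: when the flag is set, the extracted properties go to a csv report and nothing is uploaded. The function name process, the upload callable, and the Url column are assumptions made for this example; the application's actual csv layout may differ.

```python
import csv
import io


def process(documents, upload, what_if=False, report=None):
    """Upload extracted properties, or write them to csv in what-if mode.

    `documents` is an iterable of (url, properties) pairs.
    `upload` is a hypothetical callable taking (url, properties).
    `report` is a text file-like object, required when what_if is True.
    """
    documents = list(documents)
    if not what_if:
        for url, properties in documents:
            upload(url, properties)
        return
    # One column per property name seen in any document.
    columns = sorted({name for _, props in documents for name in props})
    writer = csv.DictWriter(report, fieldnames=["Url"] + columns)
    writer.writeheader()
    for url, properties in documents:
        # Properties a document lacks are written as empty cells.
        writer.writerow({"Url": url, **properties})


report = io.StringIO()
process(
    [("mail.msg", {"Subject": "Order 42"}), ("order.pdf", {"Keywords": "invoice"})],
    upload=lambda url, properties: None,
    what_if=True,
    report=report,
)
print(report.getvalue())
```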
Setting up the application comprises the following steps:
– install BulkMetadataManager and Node.js on a separate computer
– add/deploy the Content Type to the SharePoint library or libraries (if not already present)
– configure the metadata mapping using tenant properties or a static properties file
– configure the ini file for your environment (this allows for granular control over which sites and/or libraries are processed)
– run the tool via the command line
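To give a feel for ini-based configuration, the Python sketch below parses a small ini fragment. The section names, key names, and site URL are entirely hypothetical and were invented for this illustration; the real ini file shipped with the application defines its own settings.

```python
import configparser

# Hypothetical settings: the real ini file uses its own section and key names.
EXAMPLE_INI = """
[sharepoint]
site_url = https://contoso.sharepoint.com/sites/archive
libraries = Documents, Invoices

[run]
what_if = true
"""


def load_settings(text):
    """Parse the hypothetical ini text into a plain dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw_libraries = parser.get("sharepoint", "libraries")
    libraries = [name.strip() for name in raw_libraries.split(",") if name.strip()]
    return {
        "site_url": parser.get("sharepoint", "site_url"),
        "libraries": libraries,
        "what_if": parser.getboolean("run", "what_if", fallback=False),
    }


print(load_settings(EXAMPLE_INI))
```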
BulkMetadataManager can also be scheduled via the standard Windows Task Scheduler. This allows for extracting metadata from documents independent of the way they have been added to SharePoint (via web interface, OneDrive for Business, mobile apps, Power Automate, external applications, …).
The use of a CAML query allows for granular control over which documents are to be processed. BulkMetadataManager will only process documents selected using the CAML query and where the content type is present in the document’s library.
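As an illustration of what such a query can look like, the Python sketch below builds a CAML view that selects files by extension using the SharePoint internal field name File_x0020_Type. Whether the application expects a full View element or only the Where clause is not stated here, so treat the exact shape as an assumption; caml_for_file_type is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def caml_for_file_type(extension, row_limit=500):
    """Build a CAML view selecting files with the given extension.

    Hypothetical helper; the wrapper elements the application expects
    may differ from the full <View> produced here.
    """
    # RecursiveAll includes items in all folders of the library.
    view = ET.Element("View", Scope="RecursiveAll")
    where = ET.SubElement(ET.SubElement(view, "Query"), "Where")
    eq = ET.SubElement(where, "Eq")
    ET.SubElement(eq, "FieldRef", Name="File_x0020_Type")
    ET.SubElement(eq, "Value", Type="Text").text = extension
    ET.SubElement(view, "RowLimit").text = str(row_limit)
    return ET.tostring(view, encoding="unicode")


print(caml_for_file_type("msg"))
```

Building the query with an XML library rather than string concatenation ensures that special characters in the value are escaped correctly.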
BulkMetadataManager supports SharePoint Online. Contact us if you have specific requirements.
The following enhancements will be implemented in future releases:
– support for other file formats such as jpg, png, …
– support for importing metadata from csv files (to complement the extracted document properties)
– support for labelling documents (retention/sensitivity)
The target audience for running the application includes SharePoint administrators and site collection administrators. It is not intended for use by end-users.