The application helps organizations make the most of the SharePoint platform and improves user productivity and adoption. Metadata in SharePoint columns allows for filtering, sorting, grouping and powerful searches. For example, it is possible to search using the same syntax as in Outlook.
Select documents using CAML
Extract properties from documents
Upload metadata to SharePoint
See all features below
|Feature|Details|
|---|---|
|Extract metadata from emails (msg and eml) stored in SharePoint|✓|
|Select emails using CAML|✓|
|Performance|Typically 10 documents per second|
|Support for large lists|✓|
|Caters for SharePoint throttling|✓|
|Extensible|✓ (e.g., pdf, jpg, docx, ...)|
BulkMetadata is a Node.js application that runs on a separate computer system. It extracts properties from documents stored in SharePoint (e.g., the sent date, subject, … of emails) and uploads the extracted values to SharePoint columns.
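To give a rough idea of what extracting properties from an email can look like, the sketch below reads a few header fields from the raw text of an eml file. It is an illustration only, not the application's actual code: the function name `extractEmlProperties` and the returned property names are hypothetical, encoded header words are not decoded, and msg files use a binary format that would need a dedicated parser.

```javascript
// Hypothetical helper (not part of BulkMetadata): read a few header fields
// from raw .eml text. Encoded words such as =?utf-8?...?= are left as-is.
function extractEmlProperties(rawEml) {
  // The header section ends at the first empty line.
  const headerBlock = rawEml.split(/\r?\n\r?\n/, 1)[0];
  // Unfold continuation lines (a line break followed by whitespace).
  const unfolded = headerBlock.replace(/\r?\n[ \t]+/g, " ");

  const headers = {};
  for (const line of unfolded.split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx <= 0) continue; // not a "Name: value" line
    const name = line.slice(0, idx).trim().toLowerCase();
    // Keep the first occurrence of each header.
    if (!(name in headers)) headers[name] = line.slice(idx + 1).trim();
  }

  const sent = headers["date"] ? new Date(headers["date"]) : null;
  return {
    subject: headers["subject"] ?? null,
    from: headers["from"] ?? null,
    to: headers["to"] ?? null,
    // ISO 8601 string, or null when the Date header is missing or unparsable.
    sentDate: sent && !Number.isNaN(sent.getTime()) ? sent.toISOString() : null,
  };
}
```

The returned object could then be mapped to SharePoint columns; which property goes to which column depends on the configured metadata mapping.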
Basically, the application:
– downloads documents based on the provided CAML query
– extracts the document properties
– uploads the metadata to SharePoint columns.
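The three steps above can be sketched as a small pipeline. This is a minimal sketch under assumptions, not the application's implementation: `processDocuments` and the three callbacks (`listDocuments`, `extractProperties`, `uploadMetadata`) are hypothetical names supplied by the caller, and the CAML query is only an example that selects files with the eml extension.

```javascript
// Example CAML query (illustrative): all eml files in the library, including subfolders.
const camlQuery = `<View Scope="RecursiveAll">
  <Query><Where>
    <Eq><FieldRef Name="File_x0020_Type" /><Value Type="Text">eml</Value></Eq>
  </Where></Query>
</View>`;

// Hypothetical pipeline: select documents, extract properties, upload metadata.
// The callbacks are placeholders for whatever SharePoint client and parser are used.
async function processDocuments(query, { listDocuments, extractProperties, uploadMetadata }) {
  const results = [];
  for (const doc of await listDocuments(query)) {
    try {
      const properties = await extractProperties(doc);
      await uploadMetadata(doc, properties);
      results.push({ name: doc.name, ok: true });
    } catch (err) {
      // Record the failure and continue with the next document.
      results.push({ name: doc.name, ok: false, error: String(err.message || err) });
    }
  }
  return results; // one entry per document, usable for a detailed log
}
```

Passing the steps in as callbacks keeps the sketch independent of any particular SharePoint library, and a failure on one document does not stop the run.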
Common use cases:
– add metadata to emails stored in SharePoint
– extract geolocation details from images stored in SharePoint
– extract customer / project details from pdf files
The following figures show the document details before and after running the application.
The following properties can be extracted from emails stored in SharePoint:
The application generates multiple log files: a summary log file with high-level details and a detailed log file with the results for the individual documents.
The application is optimized for performance by using parallel processes for downloading documents and setting metadata, caching, and other optimization techniques. Under optimized conditions the application can process 10 documents per second. The application is built to handle very large numbers of documents (hundreds of thousands and more). In addition, the application architecture allows for scaling out by using separate computer systems.
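One common way to run downloads and metadata updates in parallel without overloading the server is a bounded worker pool. The sketch below shows that general technique; it is an assumption about how such parallelism could be built, not a description of the application's internals, and `mapWithConcurrency` is a hypothetical name.

```javascript
// Hypothetical helper: run `worker` over all items with at most `limit` in flight.
// Results are returned in the same order as the input items.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  async function run() {
    // Each runner keeps taking the next unprocessed item until none are left.
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, run));
  return results;
}
```

A call such as `mapWithConcurrency(documents, 4, processOne)` would keep at most four documents in flight; the limit of 4 is arbitrary and would need tuning against the throttling behaviour of the tenant.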
BulkMetadataManager handles throttling by SharePoint Online and automatically resumes once the Retry-After period has expired. The application also supports large lists that exceed the list view threshold (5000 items).
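Throttled requests to SharePoint Online are typically answered with HTTP status 429 or 503 and a Retry-After header giving the number of seconds to wait. A minimal sketch of honoring that header follows; `withThrottleRetry`, its options and the fallback backoff are assumptions made for illustration, not the application's actual retry logic. The `request` callback is expected to return a fetch-style response with `status` and `headers.get()`.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: repeat a request while the server reports throttling.
// Only the "seconds" form of Retry-After is handled; otherwise it falls back
// to a simple exponential backoff (1 s, 2 s, 4 s, ...).
async function withThrottleRetry(request, { maxRetries = 5, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    const response = await request();
    const throttled = response.status === 429 || response.status === 503;
    // Return on success, on a non-throttling error, or when retries are used up.
    if (!throttled || attempt >= maxRetries) return response;

    const retryAfter = Number(response.headers.get("Retry-After"));
    const seconds = Number.isFinite(retryAfter) && retryAfter > 0 ? retryAfter : 2 ** attempt;
    await sleep(seconds * 1000);
  }
}
```

The caller still has to check the returned status, because the last throttled response is passed back unchanged once `maxRetries` is reached.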
The WhatIf option allows for assessing the application and the configuration without uploading the metadata to SharePoint. The extracted document properties are saved to a separate csv file for further analysis.
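The idea behind such a dry-run mode can be sketched as follows: when the flag is set, the extracted properties are collected for a report instead of being uploaded. The names `applyMetadata`, `toCsv` and the `whatIf`, `upload` and `report` parameters are hypothetical, and the actual csv layout produced by the application may differ.

```javascript
// Hypothetical helper: serialize rows to CSV text, quoting values where needed.
function toCsv(rows, columns) {
  const escape = (value) => {
    const text = value == null ? "" : String(value);
    // Quote values containing commas, quotes or line breaks; double embedded quotes.
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const lines = [columns.map(escape).join(",")];
  for (const row of rows) lines.push(columns.map((c) => escape(row[c])).join(","));
  return lines.join("\r\n");
}

// Hypothetical step: in what-if mode, record the properties instead of uploading them.
async function applyMetadata(doc, properties, { whatIf, upload, report }) {
  if (whatIf) {
    report.push({ name: doc.name, ...properties });
    return "skipped";
  }
  await upload(doc, properties);
  return "uploaded";
}
```

After a dry run, `toCsv(report, ["name", "subject"])` would produce the text to save to a file for review; the column list here is an example.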
Install Node.js and BulkMetadataManager on a separate computer. Configure the metadata mapping using tenant properties or a static properties file.
BulkMetadataManager supports emails (msg and eml) stored in SharePoint Online. Contact us if you have specific requirements.
The following enhancements are planned for upcoming releases:
– support for other file formats such as pdf, jpg, Office files (docx, xlsx and pptx), …
– support for importing metadata from csv files (to complement the extracted document properties)
– support for on-premises SharePoint versions
The target audience for running the application is SharePoint administrators and site collection administrators. It is not intended for use by end-users.