
Bulk Metadata Management
BulkMetadataManager is an application designed to extract properties from existing SharePoint documents, such as the sent date of an email, keywords in a PDF file, the “Last Modified” date from an Office file, the “Date Taken” of a photo and the GPS location from a drone recording. These extracted values are then captured into SharePoint columns without adding a new document version or changing the SharePoint “Modified” or “Modified by” columns.
Supported File Types:
- Emails: msg and eml.
- Documents: docx, xlsx, pptx, vsdx and pdf.
- Images: jpg, png, dng, heic, heif, tiff, jfif and cr2.
- Videos: mp4, mov and wmv.
BulkMetadataManager also supports extracting unique words from email attachments (pdf, docx, xlsx, pptx, vsdx, xml, csv, txt, zip), enabling users and compliance officers to search for content within email attachments.
Performance and Scalability:
- Built to handle large volumes of documents (100,000+).
- Optimized for high performance.
- Support for horizontal and vertical scaling.
Benefits: BulkMetadataManager empowers organizations to fully leverage the capabilities of the SharePoint platform, enhancing user productivity and adoption. Utilizing metadata in SharePoint columns allows for efficient filtering, sorting, grouping, and powerful searches. Additionally, it helps organizations meet compliance requirements.
Select documents

Extract properties from documents

Upload metadata to SharePoint

Product Features

See all features below
Feature | BulkMetadataManager |
---|---|
Support for emails: msg and eml | ✓ |
Support for documents: docx, xlsx, pptx, vsdx and pdf | ✓ |
Support for images: jpg, png, dng, heic, heif, tiff, jfif and cr2 | ✓ |
Support for videos: mp4, mov and wmv | ✓ |
Configurable: Select specific sites/libraries, use wildcards, exclusions, ... | ✓ |
Granular Control: Select files using CAML queries | ✓ |
Performance: Processes 5 to 10 documents per second | ✓ |
Large List Support: Efficiently handles large lists | ✓ |
Throttling Management: Caters for SharePoint throttling | ✓ |
WhatIf Support: Provides WhatIf functionality | ✓ |
Logging: Comprehensive logging capabilities | ✓ |
Scalability: Designed to be scalable | ✓ |
Extensibility: Extensible to support additional file types (e.g., zip, odt, ...) | ✓ |
System Resources: Runs on a low-end computer (e.g., laptop, desktop or small VM) | ✓ |
Leverage metadata
Overview
BulkMetadataManager extracts properties from SharePoint documents (e.g., sent date and subject from emails, keywords from PDF files, last modified date from Office files) and captures these values into SharePoint columns. The application runs on a separate computer system and does not require any user interaction.
Benefits: BulkMetadataManager enables organizations to fully leverage the SharePoint platform, enhancing productivity and user adoption. Using metadata in SharePoint columns allows for efficient filtering, sorting, grouping, and powerful searches. By extracting the created/modified dates from the actual documents (docx, pdf, msg, jpg, etc.) rather than the import date, organizations can ensure data is retained or destroyed in line with legal requirements.
How It Works:
- Downloads documents based on a CAML query.
- Extracts the document properties.
- Uploads the metadata to SharePoint columns.
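The extraction step can be illustrated for the eml format, which is plain RFC 5322 text. The following is a minimal sketch using only the Python standard library; it is not the product's implementation, and the function name `extract_eml_properties` and the returned key names are assumptions chosen for illustration.

```python
from email import message_from_bytes, policy
from email.utils import parsedate_to_datetime


def extract_eml_properties(raw: bytes) -> dict:
    """Return a few header-based properties from a raw .eml message.

    Hypothetical helper: the key names below are illustrative and do not
    correspond to documented BulkMetadataManager column names.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    sent = msg["Date"]
    # iter_attachments() yields nothing for a message without attachments.
    names = [part.get_filename() for part in msg.iter_attachments()]
    return {
        "Subject": str(msg["Subject"]) if msg["Subject"] is not None else None,
        "From": str(msg["From"]) if msg["From"] is not None else None,
        "To": str(msg["To"]) if msg["To"] is not None else None,
        "SentDate": parsedate_to_datetime(str(sent)).isoformat() if sent else None,
        "AttachmentCount": len(names),
        "AttachmentNames": names,
    }
```

The msg format is a binary container and would need a dedicated parser; it is not covered by this sketch.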
Common Use Cases:
- Add metadata (e.g., sent date) to emails stored in SharePoint.
- Extract order details from PDF files stored in SharePoint.
- Extract last modified dates from emails, PDF files, and Office files using the properties stored within the files (not the upload date).
- Make email attachments (pdf, docx, xlsx, pptx, vsdx, xml, csv, txt, zip) searchable.
- Extract GPS coordinates and ‘Date Taken’ from images/videos to create a digital library in SharePoint.
BulkMetadataManager in-depth
Metadata extraction
The following figures show the document details before and after running the application.
[Figures: Metadata before | Metadata after]
Supported file types and properties
The following properties can be extracted from files stored in SharePoint:
File type | Properties |
---|---|
Emails (msg and eml) | Subject, From, From address, To, To address, CC, CC address, BCC, BCC address, Message ID, Sent Date, Received Date, Conversation, Has Attachment, Importance, Sensitivity, Categories, Attachment count, Attachment names, Attachment contents (unique words) |
Office Files (docx, xlsx, pptx and vsdx) | Title, Tags, Comments, Status, Categories, Subject, Company, Manager, Hyperlink Base, Last Modified, Created, Last Printed, Author, Last Modified By, Revision, Legal Entity, Project, Language, Total Editing Time, Template, Pages, Words, Characters (no spaces), Characters (with spaces), Paragraphs, Lines, Application, DocSecurity, AppVersion |
PDF Files (pdf) | Title, Author, Comments, Subject, Keywords, PDF Producer, PDF Version, Fast Web View, Number of Pages, Tagged PDF, Modified, Created, Application, Page Layout, Metadata Date, Format, Document ID, Instance ID, Language |
Images (jpg, png, dng, heic, heif, tiff, jfif and cr2) | DateTime, GPS Coordinates, GPS Latitude (DD), GPS Longitude (DD), GPS Altitude, EXIF properties, XMP properties, IPTC properties, ICC properties |
Videos (mp4, mov and wmv) | DateTime, GPS Coordinates, GPS Latitude (DD), GPS Longitude (DD), GPS Altitude, EXIF properties, XMP properties, IPTC properties, ICC properties |
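The "(DD)" suffix on the GPS latitude and longitude properties denotes decimal degrees. EXIF data usually stores a coordinate as degrees, minutes and seconds plus a hemisphere reference (N, S, E or W), so a conversion is needed. The following sketch shows the standard arithmetic; the function name `dms_to_decimal` is an assumption and is not taken from the product.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees.

    South and West are negative by convention; North and East are positive.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref.upper() in ("S", "W") else value


# 51 degrees 30 minutes 0 seconds North is 51.5 decimal degrees.
latitude = dms_to_decimal(51, 30, 0, "N")
```

Storing coordinates as signed decimal numbers makes them easy to sort and filter in a SharePoint number column.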
Recover original create and modify dates
BulkMetadataManager allows for using the original create and modify dates present within files as SharePoint Created and Modified dates.
Existing situation
After extraction, the original create and modify dates are used as the Created and Modified dates
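For Office Open XML files (docx, xlsx, pptx), the original dates live inside the file itself, which is a zip package. The sketch below reads them with the Python standard library. It assumes the core properties part sits at its conventional location, `docProps/core.xml`; a strict reader would resolve that part through the package relationships instead. The function name `read_core_dates` is hypothetical, and this is not a description of how BulkMetadataManager is implemented.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the Office Open XML core properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_core_dates(path_or_file) -> dict:
    """Read created/modified values from an Office Open XML package.

    Assumes the conventional part name docProps/core.xml. Missing
    elements are returned as None.
    """
    with zipfile.ZipFile(path_or_file) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    return {
        "Created": root.findtext("dcterms:created", namespaces=NS),
        "Modified": root.findtext("dcterms:modified", namespaces=NS),
        "LastModifiedBy": root.findtext("cp:lastModifiedBy", namespaces=NS),
    }
```

The values are W3CDTF timestamps such as `2020-06-15T12:30:00Z`, which reflect when the document was actually edited rather than when it was uploaded.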
Making Email Attachments Searchable
SharePoint Search does not index all email attachments. Our tests across different tenants indicate that the indexing rate is well below 50%, leading to significant blind spots in the search index. This limitation affects users and compliance officers, as emails may not appear in search results.
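BulkMetadataManager addresses this issue by extracting the unique words from email attachments (pdf, docx, xlsx, pptx, vsdx, xml, csv, txt, zip) and capturing them in a SharePoint column, so that attachment content becomes searchable and these blind spots are eliminated. In short, BulkMetadataManager is essential for complying with privacy and other regulations.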
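The idea of reducing attachment text to a list of unique words can be sketched as follows. This is an illustration only: the tokenization rule, the minimum word length and the function name `unique_words` are assumptions, and the product's actual rules are not documented here.

```python
import re


def unique_words(text: str, min_length: int = 3) -> list[str]:
    """Return the sorted set of distinct words in the text, lowercased.

    A word is taken to be a run of letters (digits and punctuation are
    dropped); words shorter than min_length are ignored.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    return sorted({word for word in words if len(word) >= min_length})
```

Deduplicating the words keeps the stored value compact while still allowing a search for any term that occurs in the attachment.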
Logging
The application generates multiple log files to provide comprehensive insights:
- Summary Log File: Contains high-level details.
- Detailed Log File: Includes specific details for individual documents.
- WhatIf Log File: Provides details of the extracted properties when the WhatIf property is set to true.
Performance
The application is optimized for high performance through the use of parallel processes for downloading documents and setting metadata, caching, and other optimization techniques. Under optimal conditions, the application can process 10 documents per second. It is designed to handle very large volumes of documents (100,000+). Additionally, the application architecture supports scaling out by utilizing separate computer systems.
BulkMetadataManager can manage throttling by SharePoint Online and will automatically resume operations after the retry-after period has expired. The application also supports large lists that exceed the list view threshold (5,000 items).
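SharePoint Online signals throttling with an HTTP 429 or 503 response that carries a `Retry-After` header. The sketch below shows one common way to honor that header. It is not the product's code: `call_with_retry`, its parameters and the shape of the `request` callable (returning a status code and a header mapping) are assumptions made so that the example stays self-contained.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def call_with_retry(
    request: Callable[[], Response],
    max_attempts: int = 5,
    default_wait: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call request() and wait out throttling responses (429/503).

    The wait time comes from the Retry-After header when it is a number
    of seconds; otherwise default_wait is used.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers = request()
        if status not in (429, 503):
            return status, headers
        if attempt < max_attempts:
            try:
                wait = float(headers.get("Retry-After", default_wait))
            except ValueError:
                wait = default_wait  # header was not a plain number of seconds
            sleep(wait)
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```

Passing `sleep` as a parameter makes the function easy to test without real delays.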
Scalability
BulkMetadataManager supports both vertical and horizontal scaling. Vertical scaling increases processing throughput on a single system, without the need to deploy additional infrastructure. Instances remain isolated from each other while metadata processing stays efficient.
Horizontal scaling distributes the work across separate systems to increase processing throughput.
WhatIf
The WhatIf option enables assessment of the application and its configuration without uploading metadata to SharePoint. Extracted document properties are saved to a separate CSV file for further analysis.
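The dry-run pattern behind a WhatIf option can be sketched as a single switch that routes extracted properties either to an upload function or to a CSV report. The names `apply_metadata`, `upload` and the column names in the usage below are hypothetical; the layout of the product's WhatIf file is not specified in this document.

```python
import csv
from typing import Callable, Iterable, Mapping, TextIO


def apply_metadata(
    records: Iterable[Mapping[str, str]],
    upload: Callable[[Mapping[str, str]], None],
    what_if: bool,
    report: TextIO,
) -> int:
    """Upload records, or only write them to a CSV report when what_if is True.

    Returns the number of records that were actually uploaded.
    """
    rows = list(records)
    if not what_if:
        for row in rows:
            upload(row)
        return len(rows)
    # Dry run: collect every property name so the CSV has a stable header.
    fieldnames = sorted({name for row in rows for name in row})
    writer = csv.DictWriter(report, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return 0
```

A caller would open the report with `open("whatif.csv", "w", newline="")` and inspect the result before running again with `what_if=False`.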
Deployment
Deployment involves the following steps:
- Install Application: Install the BulkMetadataManager application (exe) on a separate computer (e.g., small VM, laptop, desktop, …).
- Add/Deploy Content Type: Add or deploy the Content Type to the SharePoint libraries (if not already present).
- Configure Metadata Mapping: Set up metadata mapping using a static properties file or tenant properties.
- Configure INI File: Adjust the ini file for your environment to allow granular control over which sites and/or libraries are processed.
- Update CAML File: Modify the CAML file to select the documents of interest.
- Run Tool: Execute the exe tool via the command line.
BulkMetadataManager can also be scheduled using the standard Windows Task Scheduler. This enables metadata extraction from documents regardless of how they were added to SharePoint (via web interface, OneDrive for Business, mobile applications, Power Automate, external applications, etc.).
Granular Control
The use of a CAML query allows for granular control over which documents are processed. BulkMetadataManager will only process documents selected using the CAML query and where the content type is present in the document’s library.
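As an illustration of what such a query can look like, the sketch below builds a CAML view that selects files by extension and pages through the results, which also helps with lists above the list view threshold. The helper `build_caml` is hypothetical. `File_x0020_Type` is the internal name of the SharePoint file type field; how the query must be stored in the BulkMetadataManager CAML file is not described here, so treat the output as an example of CAML syntax only.

```python
import xml.etree.ElementTree as ET


def build_caml(file_extension: str, row_limit: int = 500) -> str:
    """Build a CAML view selecting items whose file type equals file_extension.

    Scope="RecursiveAll" includes items in subfolders; Paged="TRUE" asks
    SharePoint to return results in pages of row_limit items.
    """
    view = ET.Element("View", Scope="RecursiveAll")
    where = ET.SubElement(ET.SubElement(view, "Query"), "Where")
    condition = ET.SubElement(where, "Eq")
    ET.SubElement(condition, "FieldRef", Name="File_x0020_Type")
    value = ET.SubElement(condition, "Value", Type="Text")
    value.text = file_extension
    limit = ET.SubElement(view, "RowLimit", Paged="TRUE")
    limit.text = str(row_limit)
    return ET.tostring(view, encoding="unicode")


query = build_caml("msg")
```

Building the XML with a library rather than string concatenation ensures that special characters in values are escaped correctly.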
Supported Environments
BulkMetadataManager supports SharePoint Online. Contact us if you have requirements for on-premises SharePoint versions.
Enhancements
The following enhancements will be implemented in future releases:
- Support for importing metadata from CSV files (to complement the extracted document properties).
- Support for SharePoint 2019 and SharePoint Server SE.
- Support for labeling documents (retention/sensitivity).
- Support for binary Office files (doc, xls, ppt).
Target Audience
BulkMetadataManager is intended for use by SharePoint administrators and site collection administrators. It is not designed for end-users.
For Further Information
In short, BulkMetadataManager offers unique functionality to manage the metadata of your SharePoint documents. It enhances user productivity and ensures compliance with privacy, archiving, and other regulations.