Atakama integration with Data Discovery & Classification tools

Numerous data discovery & classification (DDC) solutions are compatible with Atakama. Atakama integrates best with DDC providers that support Atakama's command-line-encrypt data flow, the .ip-labels classification file, or MSIP labels (each detailed below).

 

1. Command-line-encrypt data integration (works with most providers)


Once Atakama is installed on the DDC system's server, the file locations (cloud or on-premises network) need to be set up as Atakama Secure Folders.


The server running the DDC will need to be granted access to the Secure Folders. If this is not desired, a second-factor key can be created, after which the app can be removed and the backup discarded. Future versions of Atakama will support write-only installations. See method #2 below.


You will then need to configure the DDC to run the following command whenever a file is classified and needs to be encrypted:

 

C:\ProgramFiles\Atakama\Atakama --protect %path%

 

Here "%path%" stands for whatever placeholder the data classification tool uses for the path of the classified file.

 

This will result in files being encrypted as they are classified. 
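If the DDC tool can call a script rather than a bare command, the invocation above could be wrapped as in the following minimal sketch. The executable path is copied verbatim from the command above and should be adjusted to the actual install location. The helper names build_protect_command and protect are hypothetical, and treating exit status 0 as success is an assumption, not documented Atakama behaviour.

```python
import subprocess

# Path copied verbatim from the command above; adjust it to the actual install location.
ATAKAMA_EXE = r"C:\ProgramFiles\Atakama\Atakama"


def build_protect_command(path, exe=ATAKAMA_EXE):
    """Return the argument list for `Atakama --protect <path>` (hypothetical helper)."""
    return [exe, "--protect", str(path)]


def protect(path, exe=ATAKAMA_EXE):
    """Run the protect command for one file.

    Returns True when the process exits with status 0. Treating 0 as
    success is an assumption, not documented Atakama behaviour.
    """
    result = subprocess.run(build_protect_command(path, exe), check=False)
    return result.returncode == 0
```

Passing the arguments as a list, rather than as one string, avoids quoting problems when the classified file's path contains spaces.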


2. Flexible integration (works with some providers)


This option requires the "Classifier Auto-Encryption" option in the Options tab of the Control Center to be enabled. When it is enabled for the first time, you will need to quit and restart Atakama.


Note: Files that already exist in a Secure Folder location before this option is enabled will not be automatically encrypted.


Some DDC providers allow writing a file containing a summary of classification information called “.ip-labels” (Information Protection Labels). Ideally, these files should be read-only for all but the DDC system itself.

 

This setup does not require installation of Atakama on the DDC server.

  • The file must contain valid JSON.

  • The file should be written to the root of any folder that contains labelled files.

  • The file must be hidden on Windows systems.

  • The file must be protected so that only administrators, or a suitably restricted group of network users, have access to write it.

  • The file should allow all users who have access to that location to read from it.
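As an illustration of the requirements above, a DDC-side script might produce the file roughly as follows. This is a minimal sketch, not a reference implementation. The function names are hypothetical, only the "files" field is written, the uppercase hex digest simply mirrors the example in this article, and write permissions (ACLs) are not handled here and must be restricted separately.

```python
import ctypes
import hashlib
import json
import os
from pathlib import Path

FILE_ATTRIBUTE_HIDDEN = 0x02  # Win32 file attribute constants
FILE_ATTRIBUTE_NORMAL = 0x80


def md5_hex(path):
    """Return the uppercase hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def write_ip_labels(folder, labels_by_name):
    """Write <folder>/.ip-labels from a mapping of file name -> list of labels."""
    folder = Path(folder)
    files = {
        name: {"labels": list(labels), "hash": md5_hex(folder / name)}
        for name, labels in labels_by_name.items()
    }
    target = folder / ".ip-labels"
    on_windows = os.name == "nt"
    if on_windows and target.exists():
        # A hidden file cannot be reopened for writing, so clear the flag first.
        ctypes.windll.kernel32.SetFileAttributesW(str(target), FILE_ATTRIBUTE_NORMAL)
    target.write_text(json.dumps({"files": files}, indent=4), encoding="utf-8")
    if on_windows:
        # Mark the file hidden, as required on Windows systems.
        ctypes.windll.kernel32.SetFileAttributesW(str(target), FILE_ATTRIBUTE_HIDDEN)
    return target
```

For example, write_ip_labels(r"\\server\share\hr", {"payroll.xlsx": ["US SSN"]}) would write the file at the root of that folder. The share path and file name here are made up.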

 

Example of an .ip-labels file:


{
    "run_start_time": "10/19/2020 17:05:01 (UTC-05:00) Eastern Time (US & Canada)",
    "run_guid": "6abff42d-16cd-4312-ab29-9451b2e838fe",
    "files": {
        "example_file_1": {
            "labels": ["US SSN"],
            "hash": "A634F1412E52CD3AB966EA47A2B6CD1C"
        },
        "example_file_2.docx": {
            "labels": ["Credit Cards", "US Drivers License"],
            "hash": "17F15BAC04491499F13B929D3CE9F759"
        },
        "example_file_3.xls": {
            "labels": ["Legal Keywords", "Medical Diagnoses", "Passwords"],
            "hash": "FCEB0ED43B1C4A45E4109F31EC561980"
        },
        "example_file_4.ext": {
            "labels": ["Legal Keywords"],
            "hash": "9A50A1817C4B0ABA32B9699EFBA9033F"
        }
    },
    "signature": "9A50A1817C4B0ABA32B9699EFBA9033F"
}


Atakama will detect when this file has been written and will encrypt files when the DDC applies labels to them. This setup works with almost any topology: Atakama can run on users' workstations only (i.e., serverless), and each workstation will obey the .ip-labels file and encrypt files.
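To make the workflow concrete, the sketch below shows how a consumer of an .ip-labels file could decide which files are labelled and still match their recorded MD5 hash. It is not Atakama's implementation. The function name files_to_encrypt is hypothetical, and skipping entries with a stale hash or an empty label list is an assumption made for illustration.

```python
import hashlib
import json
from pathlib import Path


def files_to_encrypt(folder):
    """Return names of labelled files whose recorded hash, if any, still matches."""
    folder = Path(folder)
    manifest = json.loads((folder / ".ip-labels").read_text(encoding="utf-8"))
    selected = []
    for name, entry in manifest["files"].items():
        path = folder / name
        if not entry.get("labels") or not path.is_file():
            continue  # unlabelled entry, or the file no longer exists
        recorded = entry.get("hash")
        if recorded is not None:
            actual = hashlib.md5(path.read_bytes()).hexdigest()
            if actual.lower() != recorded.lower():
                continue  # assumed behaviour: ignore entries with a stale hash
        selected.append(name)
    return selected
```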


Field Specification:

  • files: (required) a dictionary whose keys are the file names, including extensions, and whose values are dictionaries with the following fields:

      • labels: (required) a list of UTF-8 label names

      • hash: (optional) MD5 hash of the file, used to validate that the labels file is up to date

  • run_start_time: (optional) date/time, format unspecified, not used by Atakama

  • run_guid: (optional) unique ID representing the latest processing run that produced the file; format unspecified, not used by Atakama

  • signature: (optional) the ECDSA signature of the sorted, minified JSON contents, excluding the signature itself. The default curve is SECP256K1; other curves can be selected via the enterprise configuration.
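The "sorted, minified JSON contents, excluding the signature itself" could be produced along the lines of the sketch below. The function name signing_payload is hypothetical. Sorting keys at every nesting level, UTF-8 encoding, and leaving non-ASCII characters unescaped are all assumptions, since this article does not specify the exact canonical form. The ECDSA signing step itself is left to a cryptography library of your choice and is not shown.

```python
import json


def signing_payload(manifest):
    """Return sorted, minified JSON bytes of the manifest without its signature."""
    unsigned = {key: value for key, value in manifest.items() if key != "signature"}
    # sort_keys orders keys at every nesting level; the separators drop all
    # optional whitespace. Both choices are assumptions about the canonical form.
    text = json.dumps(unsigned, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")
```

Confirm the exact canonical form and signature encoding with Atakama support before relying on signatures produced this way.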

3. Microsoft Information Protection labels (MSIP)

Prerequisites:

  1. Enable the "Classifier Auto-Encryption" option in the Options tab of the Control Center. When it is enabled for the first time, you will need to quit and restart Atakama.

  2. Open the atakama.config file in "%homepath%" and find "auto_encrypt".

  3. By default, “iplabel” is listed in the “detector_list”. Replace it with “metadata”, or add “metadata” on another line if you also want to keep using iplabel.

  4. Save atakama.config and close it.

  5. Launch Atakama.

Note: Files that already exist in a Secure Folder location before this option is enabled will not be automatically encrypted.

 

Atakama can be configured to recognize MSIP labels that have been embedded in Microsoft Office documents. MSIP labels cannot be embedded in non-Microsoft Office files.

 

This is a viable solution for organizations that use Microsoft Office documents for sensitive information.

 

Simply configure the DDC to embed MSIP labels and Atakama will automatically detect and encrypt files that have been labeled.
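To check that the DDC is actually embedding labels, it can help to inspect a document directly. The sketch below relies on our understanding that Office Open XML files (.docx, .xlsx, .pptx) store MSIP labels as custom document properties named MSIP_Label_<GUID>_Name inside docProps/custom.xml. Treat that naming, the vt: namespace prefix, and the hypothetical function name msip_label_names as assumptions. This is a diagnostic aid only, not how Atakama detects labels.

```python
import re
import zipfile

# Matches a custom property such as MSIP_Label_<GUID>_Name and captures its text value.
MSIP_NAME = re.compile(
    r'name="MSIP_Label_([0-9A-Fa-f-]{36})_Name"[^>]*>\s*<vt:lpwstr>([^<]*)</vt:lpwstr>'
)


def msip_label_names(office_file):
    """Return the label names found in an OOXML file's custom properties."""
    with zipfile.ZipFile(office_file) as archive:
        try:
            xml = archive.read("docProps/custom.xml").decode("utf-8")
        except KeyError:
            return []  # no custom properties part, so no embedded labels
    return [match.group(2) for match in MSIP_NAME.finditer(xml)]
```

An empty result for a file that the DDC reports as labelled suggests that the labels are not being embedded in the document itself.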

 

For other embedded labels, such as EXIF labels or other custom formats, Atakama can be configured with a regular expression that recognizes these labels as well. Implementing this solution can be complex, so we suggest contacting Atakama support for assistance.


