How to prepare your data ahead of importing it into Pigment

  • 25 April 2022
  • 6 replies
  • 821 views

Userlevel 5
Badge +7

The purpose of this article is to provide guidance on the data files/formats that are to be used to import into Pigment. Included in the article is an Excel guide that includes an example structure of common use cases.
 

 

Before preparing your file to be imported, its important to note that all excel files must be converted into a .CSV before they can be imported into Pigment.  Check out this support article from Microsoft on how to save as .CSV


Tips for Data imports into Pigment 👩🏻‍🏫

 

  1. Separate Dimension IDs from Dimension Names

Where you have a dimension with IDs in your source system you should include:

- the IDs in a column,

- the Name of the dimension in a second column.

This will allow you to follow best practice to load your data with IDs, but, also have a display name based on the Name. If you don't have Ids, we can load using the 'Name', however, this can cause issues down the line if dimensions start to change names (E.g. Account 51000 - Revenue becomes Account 51000 - Revenues).

Note:  If you have a large transaction list, it might be better to import just the ID and maintain the Display names through a separate import into the dimension list.

2. Include a Unique Transaction ID 🆔

Unique transaction IDs increase sustainability and auditability.

They also allow you to update Pigment data by importing a file with the IDs that are already in the Transaction List. If you don't have a Unique ID, you either have to update records manually, or you have to delete the relevant transactions and upload them again.

3. Only include data that is required on Pigment in the data extracts 🎯

Less data = Faster Processing time

4. Include ‘Period’ as a column (if you don't have period, then have an 'Upload Date' column) 📅

This will allow the assignation of data to the relevant periods, and will allow us to track data over time. for example, the Cost Centre's headcount at the time each file was uploaded.

5. Remove unnecessary formatting 💷


Include minimal formatting in the data rows. Remove all special characters including currencies. If you need to include the currency in the import, include them in a separate column.

As a result we can upload the relevant columns in the format they will be used , for example, $200 would be uploaded in two columns, column 1 for currency "USD" and Column 2 for Amount which would be "200". 

 

The alternative would be that we upload these columns as text and then convert them, for example, $200 would be imported. We then have to created 2 additional properties within the transaction list to calculate the currency of each transaction and the value.  This against best practice as we are creating 3 columns, 2 with formulas of what could be achieved with 1. More Cells, larger processing time.


6. Import data using a flat file structure 🥿

It's generally better to have the file in a flat format, with all the data being formatted in columns

Flat files are more dynamic.If new items/columns are added to the pivoted section in a pivot upload, then the import would have to be rebuilt. Flat file imports wouldn't need to a new import if you added a new item.

7. Minimise the number of blank rows and missing data ❌

A lower number of blank rows, leads to a lower number of  transactions which results in faster  processing times and greater auditability. 

If fewer columns are missing data, then the less functionality that has to be implemented on Pigment to clean the data.

8. Remove all Sub Totals & Totals 🔢

All sub totals and totals will be calculated on Pigment in blocks or through the use of the hierarchies. Uploading totals only creates additional transactions that we have to ignore by creating functionality. It also increases the chance of errors.

 


6 replies

Badge

Question, if I have data where the Units sold are in each column, where the column is by month.  Should I make it so the months are in one column so they can be a dimension?

Userlevel 5
Badge +7

@Paylward Yes - Ideally the best format is to have all months within one column.

We do have the ability in Pigment to upload your described format (pivoted view), however, you may have to update your import configuration when you change the file structure (for example if you add additional months / years in future, the configuration would need to be changed). 

For a sustainability point of view its better to have all months in one column (although both file formats are accepted)

Badge

thank you that is great.  One more question. Let’s say I have the raw data in the right format, but the units sold is by Product Group.  I have the Products underneath each product group and the price of each product, and the % of units sold of each product.  I need to map out and show how the product groups are broken out down to product level by that % break out.  Then need to add price to get revenue.  Should I do that at the beginning, or use pigment to do that?

Userlevel 5
Badge +7

Hi @Paylward 

I would do the following:

  1. Create the following dimensions on Pigment
    • Product Group
    • Product (including mappings to group)
  2. In your CSV file, I would include only the lowest level data. No totals, subtotals, or calculations (No % of units sold)
  3. Your Source file should have for each row the Product, Month, No. of Units Sold, and Price per Unit.
  4. Upload this data into a Pigment Transaction list
  5. Create an aggregation metric
  6. Build your reporting metrics / table including calculations for Revenue and % of product group.
Badge

I do have those dimensions set up.  The source file has the units sold, dates, regions, countries….but by Product Group, not the lowest level which is Product.  Also as part of the file I have what the Product Group should break out in terms of each Units by Product….. is that the aggregation you are speaking of?  Is that an equation with one of those modifiers?

Userlevel 5
Badge +7

Hi @Paylward 

You can either calculate the Product level data before loading into Pigment and then follow the steps above, or you can do those calculations on Pigment.

If you were going to do those calculations on Pigment:

  1. Create the following dimensions on Pigment
    • Product Group
    • Product (including mappings to group)
    • KPI Type (two items: Price, % Allocation)
  2. Create two CSV Files
    • Product Group File to be loaded into a transaction list
    • Product Allocation file that includes the product information (Price and % Allocation). This needs to be imported into a metric by KPI Type and Product
  3. Upload the data into Pigment
  4. Create an aggregation metric  from the transaction list (to aggregate the values to be by the reporting dimensions using the BY modfier). This metric would be the quantity by Product Group
  5. Create an allocation metric that multiplys the quantity by product group by the allocation % metric to get the Quanity by product.
  6. Create another metric which is Revenue by Product. This would multiply the price by product KPI by the quantity by product metric.
  7. Build your reporting metrics and tables

Reply