De-duplicate rows in PowerBI – but keep the summary

I needed to solve a problem in PowerQuery / PowerBI for summarising totals in a table colum and assigning the values into another table without repeating these values on each row.

In our reporting API, we have line items for both Purchase Orders and Invoices in one table. They are classified as Type 1 - purchase order or Type 2 - invoice. What I wanted to do is add them up and arrive at the order / invoice balance. We offer this anyway in our Order tables but I was trying to figure out another solution. And as with all things PowerQuery, and with a steer from ChatGPT, there is a solution and it goes like this. 

Summarise Values

The line items table has everything I need. So step 1, having cleaned up the dates and some other data cleansing tasks, I duplicated the lines table. I filtered for purchase orders, type 1, and grouped by purchase order number, summarising the NetValue of the line. My table is then effiectively 

PurchaseOrderNumber OrderValue
1001 1950.00

I repeat this for the Invoices, changing Type to 2. Now I have a similar table again with PurchaseOrderNumber and the NetValue of any invoice lines

PurchaseOrderNumber InvoiceValue
1001 1850.00

I merged these two tables together so that we have side by side values of order and invoice and a calculated column for the balance. 

Next up, I wanted to get these values into the Orders table. So I Merged, linking on the OrderNumber. 

Now another Merge of Order into the LineItems table so every line item has these new values, linking on the purchase order number. 

Next up, I added an Index to the LineItems table. And the next bit is the celever bit that gives me a unqiue balance value  on only 1 of the line items. I created an index column to the LineItems table. 

I duplicated the lineItems table again. 

I then removed duplicates based on the PurchaseOrderNumber and the new Balance field. I also removed type 2 lines - Invoice Lines. 

And the master stroke, merge the de duplicated LineItems table back into the LineItems table, joining on the Index field. And now I have my Line items table with a unique value of the balance on only one of the lines. An example below 

PONumber Price  QTY Description Net OrderSum InvoiceSum Balance  IndexID
1001 20.00 10 Widgets 200.00 325.00 250.00 75.00 500
1001 20.00 5 Bodgets 100.00       501
1001 25.00 1 shipping 25.00       502

 

Like all things PowerQuery, there is a solution here, and for me it was the addition of the Index column. I saw that in a thread somewhere and pursued it and that's what allows the de duplication. This might not be the most efficient use of PowerQuery resource with the duplication of tables but it's a means to an end and solved the problem. 

 

Tags: 

PowerQuery, PowerBI, ExCel, Deduplication of  columns, summarising values 

Previous

New Accounts Payable Starter Checklist

Next

Accounts Payable Workflow Automation: Getting Started

  • Sun
  • Mon
  • Tue
  • Wed
  • Thu
  • Fri
  • Sat
    Schedule your exclusive software demo today!