Creating Maintainable Power BI Report Models – Part 1

Power BI Kitchen

Professional chefs use a French kitchen organization approach called “mise en place,” which literally means “putting in place.” It ensures that ingredients and utensils are optimally placed for efficient cooking. Power BI report models require the same type of approach to ensure that they remain maintainable over time. This applies whether you are reporting against Microsoft Project Online, Microsoft Project Server or Office 365.

We’ll focus on three techniques for the report developer’s back end configuration this week. We’ll cover the front end configuration next week as it relates to improving the end user experience.

Fast is not always good

Power BI allows you to accomplish a lot quickly when it comes to transforming data. However, if you aren’t smart about how you approach your back end configuration, you could be creating a maintenance nightmare for yourself.

Do you know what is what in a report model that you haven’t touched in months? These three techniques will enable you to easily understand and work with your transformation code over time.

What was I thinking when I did this?

This is a common problem for any developer who has ever done work in a rush. The reason for doing what you did in the fifteen transformation steps was blatantly clear the day you created the model. However, the sands of time have worn away the memories and now you are doing code archaeology to understand your previous work.

Power BI allows you to make your transformation steps self-documenting.

To make your transformation steps descriptive, do the following.

  • Open the query editor
  • Select a data set in the left pane. The list of transformation steps is shown in the Applied Steps pane on the right.
  • Right-click on the transformation step you want to rename
  • Rename the step with something descriptive

Power BI Renaming a Transformation

Instead of accepting the default “Grouped Rows” step name, you can rename the step to read “Grouping by Year, Week, Resource to provide weekly work totals.” Doing so creates a coherent story of what is happening and why.

Power BI Self Documenting Transformations
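
Behind the scenes, each renamed step becomes the step identifier in the generated M code, so the story carries into the Advanced Editor as well. Here is a minimal sketch of what the rename looks like in M, using an inline #table as stand-in data (a real query would reference your own source):

let
// Stand-in data; a real model would load this from your data source
Source = #table({"Year", "Week", "Resource", "Work"}, {{2016, 1, "Ann", 8}, {2016, 1, "Ann", 32}, {2016, 1, "Bob", 24}}),
// The renamed step reads as documentation in the Advanced Editor
#"Grouping by Year, Week, Resource to provide weekly work totals" = Table.Group(Source, {"Year", "Week", "Resource"}, {{"Weekly Work", each List.Sum([Work]), type number}})
in
#"Grouping by Year, Week, Resource to provide weekly work totals"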

What is this data set again?

Power BI provides a default data set name, typically based on the source of the data retrieved. This may not be desirable, as the data set may be an amalgamation of data from several sources and have a different intent than that indicated by the first data retrieved. Additionally, this name appears to the end user and may have no meaning to them.

To make your data set name descriptive, do the following.

  • Open the query editor
  • Select a data set in the left pane
  • On the right side, in the Properties Name box above the Applied Steps, rename the data set to be more meaningful to the user
    • If you are using our Conversation-centric design approach, you should already have some candidate names available
  • To update, simply type over the value in the Name field and click off of the field
  • The change is then persisted when you save the model

Below, we renamed AssignmentTimephasedDataSet to Resource Work Summary by Week.

Power BI Naming Data Sets

Where did I put that thing?

Another challenge that you may encounter is trying to find a query data set when you have several in a report model. Power BI allows you to create groups and assign your data sets and functions to a specific group.

If I am working on a model that is in production, I can create a Version 2 group, copy the data set, and move the copy to the Version 2 group. This way, I have both versions and can ensure I don’t accidentally change the wrong one.

In the example below, I used groups as a way of organizing my demos for a webinar. Each group demo number corresponded to a PowerPoint slide so it was easy to keep my place.

Power BI Using Groups

To create a Power BI query group

  • Right-click anywhere on the Query Editor left pane
  • Select New Group…
  • Fill out the name and the description
  • Click OK

The description can be seen when you hover over the group name. It is very helpful for providing additional context.

To move a data set to a Power BI query group

  • Right-click on the data set name
  • Select Move to Group
  • Select the Group

These three techniques will help you keep things organized and understandable. Next week, we’ll discuss best practices for enabling a great end user experience.

3 Ways to Rev Your Microsoft Project Online and Power BI Performance (Number 2 Is Our Favorite)

Are you tired of waiting for long refresh times with Power BI? Perhaps you are getting timeouts because you are retrieving too much data. There are three easy ways to avoid these issues.

WHY IS MY POWER BI DATA REFRESH SO SLOW?

Many Power BI models refresh slowly as the models are not structured to take advantage of query folding.

Query folding is a process by which specific Power BI transformations are executed by the data source instead of by Power BI. Allowing those actions to occur at the source means less data has to be sent to Power BI. Less data means faster data refresh times and smaller report models. Note, not all data sources support query folding, but OData for Project Online and SQL Server for Project Server do.

A sample of these foldable actions includes:

  • Column or row filtering
  • Group by and aggregation
  • Joins
  • Numeric calculations

For example, say you need to find the amount of work scheduled by week for the quarter, and you are querying the timephased data from Project Online. If you aren’t careful, you may be retrieving hundreds of thousands of records. Query folding will make a huge difference in speed and in the number of records retrieved. If Project Online filters the records before returning them, you may receive only a few thousand records instead.

ISSUE #1: NOT REFERENCING THE DATA SOURCE PROPERLY

This issue occurs primarily with OData sources, such as Project Online, Office 365 and other web-based sources. Query folding breaks if you don’t properly reference the OData feed.

In order for query folding to work properly, the transformations that have folding support need to be the first things executed. Once a non-folding transformation is added, no subsequent transformation will be folded. If you don’t reference the OData feed properly, the step that navigates to the OData feed isn’t foldable, which blocks all subsequent transformations from being folded.
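
As a hedged sketch (the server URL is a placeholder, and the entity and field names simply mirror examples used later in this series), referencing the feed with OData.Feed and navigating to the entity keeps folding alive, with a foldable filter as the first transformation:

let
// Foldable: navigate the OData feed rather than hand-building an entity URL
Source = OData.Feed("https://YOURSERVERNAME.sharepoint.com/sites/YOURPROJECTWEBAPP/_api/ProjectData"),
Resources_table = Source{[Name="Resources",Signature="table"]}[Data],
// A foldable filter placed first is translated into the OData request,
// so only the matching rows ever leave the server
#"Active Work Resources" = Table.SelectRows(Resources_table, each [ResourceType] = 2 and [ResourceIsActive] = 1)
in
#"Active Work Resources"

By contrast, starting from something like Json.Document(Web.Contents(...)) hands Power BI plain records instead of a foldable table, so every later step runs locally.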

ISSUE #2: DOING DYNAMIC DATE SELECTION IMPROPERLY

This issue causes more headaches with Project’s timephased data than any other action. In many cases, you are looking to create a range of dates for Power BI to query. If you use the Between filter in the Power BI Query Editor, there’s no option for today’s date +/- a number of days. If you use many of the other Date/Time filters, like Is Latest, you still seem to pull back a lot of data.

Solving this particular issue requires getting familiar with the M query language so that you can understand how Power BI Desktop performs specific actions.

For example, let’s look at the Is Latest date selection. By default, your dataset is unsorted, so Power BI creates a list of the values in the selected column and takes the Max value. While this is correct, it can result in a lot of records being retrieved just to perform a Max and return one record. The M code over a timephased table looks like this:

= Table.SelectRows(#"Filtered Rows", let latest = List.Max(#"Filtered Rows"[TimeByDay]) in each [TimeByDay] = latest)

To get much better performance, make the following two changes. First, sort the dataset in a descending manner on the column you are using to decide the latest value. In this example, that’s TimeByDay. Sorting is a foldable action, so the data source will do the work.

Next, change List.Max to List.First. Since the latest date is now the first record in the sorted dataset, far less data is required to get the answer. The statement becomes:

= Table.SelectRows(#"Filtered Rows", let latest = List.First(#"Filtered Rows"[TimeByDay]) in each [TimeByDay] = latest)

In testing, the original approach required over 1 MB of data to be retrieved to answer the question. The new approach retrieves only 127 KB.
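
Putting the two changes together, here is a minimal runnable sketch of the optimized pattern; the inline #table stands in for the #"Filtered Rows" step you would already have over the real timephased feed:

let
// Stand-in for the filtered timephased data
#"Filtered Rows" = #table({"TimeByDay", "Work"}, {{#date(2016, 3, 1), 8}, {#date(2016, 3, 2), 6}, {#date(2016, 3, 3), 4}}),
// Sort descending on the column that decides "latest"; sorting is foldable
#"Sorted Rows" = Table.Sort(#"Filtered Rows", {{"TimeByDay", Order.Descending}}),
// List.First on the sorted column replaces List.Max, needing far less data
#"Kept Latest" = Table.SelectRows(#"Sorted Rows", let latest = List.First(#"Sorted Rows"[TimeByDay]) in each [TimeByDay] = latest)
in
#"Kept Latest"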

ISSUE #3: URLS ARE TOO LONG

This issue comes into play when working with Project Online and SharePoint-based data sources. Project, in particular, has a lot of long field names. SharePoint has a limit of 2048 characters for a URL. If you are manually selecting a lot of field names, you can accidentally exceed the 2048-character limit.

What this means is that Power BI receives a URL that can’t be acted upon. The default Power BI behavior is then to simply retrieve the feed without modifiers. If this happens on a source with many rows, it can significantly impact performance.

In this case, break your dataset up into two or more queries that fit within the limitation and then merge them back together in Power BI, as sketched below.
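
Here is a rough sketch of that split-and-merge pattern, with a placeholder URL and illustrative entity and column names; split wherever it keeps each generated URL under the limit:

let
Source = OData.Feed("https://YOURSERVERNAME.sharepoint.com/sites/YOURPROJECTWEBAPP/_api/ProjectData"),
Projects_table = Source{[Name="Projects",Signature="table"]}[Data],
// Each narrow column selection keeps its generated $select clause short
#"First Half" = Table.SelectColumns(Projects_table, {"ProjectId", "ProjectName"}),
#"Second Half" = Table.SelectColumns(Projects_table, {"ProjectId", "ProjectStartDate"}),
// Rejoin the halves locally on the shared key
#"Merged" = Table.NestedJoin(#"First Half", {"ProjectId"}, #"Second Half", {"ProjectId"}, "Rest", JoinKind.Inner),
#"Recombined" = Table.ExpandTableColumn(#"Merged", "Rest", {"ProjectStartDate"})
in
#"Recombined"

In practice, the two halves would usually live as separate queries so that each issues its own request.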

WANT TO KNOW MORE?

Join us for a free Power BI performance clinic on March 22! This session will do a deep dive into these issues as well as show you how to determine when folding is happening (and when it isn’t). Go here to register!

 


See Cortana and Power BI Integration In Action

My Star Trek moment has finally arrived!

Growing up, it always seemed like a pipe dream to be able to ask questions of the computer and have her answer. Microsoft has brought us the closest to this dream with Cortana and Power BI integration. You can now summon answers from your Power BI dashboards with the power of your voice.

Attached is a very short demo and hopefully you can see this is just the beginning. I’m working on a more extensive demo and will be presenting this functionality next week at SharePointFest Chicago.


How To Use Your Free PowerBI.com license with Microsoft Project Online

You can use the free PowerBI.com license with Microsoft Project Online to do your basic reporting. There are caveats but I’ll show you ways that you can use the free license effectively.

What is Power BI?

Microsoft Power BI is a collection of online and client tools that enable organizations to visualize and explore data, share insights and allow consumption of these insights from multiple platforms.

PowerBI.com is the home site for the service and serves as the portal to see your dashboards. Power BI supports three primary entities:

  • Dashboards – A tile-based way of showing you important information in a glanceable and easy-to-use fashion. Dashboards also support natural language querying, where the viewer can type in a question and have Power BI build a new visualization automatically.
  • Reports – Similar to dashboards in that they are tile based. However, reports are more interactive in that clicking on a bar in a chart filters everything on the report. A given Power BI project can have up to 10 reports.
  • Datasets – Sets of data that have been transformed into business-usable formats.

Power BI is available on all mobile platforms via mobile applications in the platform app stores.

It’s Free?

Yes! Awesome, isn’t it? When Microsoft last updated the Power BI pricing plans, they came out with a free license tier. To get your license, go to http://PowerBI.com and log in with your Office 365 account. Note, you can sign up with other accounts, but if you want to use this license with Project Online, you have to sign up with the account with which you access Project.

Now What?

You need to download and install Power BI Desktop to create and publish your BI content. Think of Power BI Desktop as “Word for Reporting”: you put it all together in Desktop, but this isn’t where you would typically consume your BI content.

The Approach

Assumptions

For this approach to work, three things have to be true

  • Your report consumer wants to consume the data but desires accessibility over the ability to customize the views
  • The total amount of data you need to make available is less than 1 GB
  • Automatic daily refresh of data is acceptable

The second point is important as the data stored counts against both the creator of the visualizations and the consumer. The last point is the norm for most Project clients as most information is updated weekly.

Tactics

Publish the Report Pack

  • If you aren’t already signed up for PowerBI.com, create your free account using the same account you use to access Project Online.
  • If you haven’t already done so, download and install Power BI Desktop.
  • Create your Project visualizations as you normally would.
  • When finished, publish to PowerBI.com using the Publish button.

Configure Auto-Refresh

  • Open PowerBI.com
  • Hover over the dataset for your report, in this case, Project Report
  • You will see an ellipsis button appear on the right. Click it
  • Select Schedule Refresh
  • Expand Data source credentials and enter your Office 365 credentials for Project Online
  • Expand the Schedule Refresh section and set up when you wish the data refresh to happen
  • Click Apply

Share the Report

Content Packs are a way of sharing report packs that allows the report consumer to customize the reports to their liking. However, creating and consuming Organizational Content Packs is a Pro-only feature. You can consume the Microsoft-supplied content packs with the Free license.

This is where the requirement that the end user is a simple consumer comes into play. Free-license users can share their dashboards and reports with other free users. Sharing simply makes the dashboard read-only. For those tiles that support drill-through, consumers can still access the underlying reports.

To Share

  • Select your dashboard in the Dashboards group on the left pane
  • In the upper right corner of the screen, select the Share button
  • Enter either individual emails or, preferably, AD groups with which to share the dashboard. These email addresses have to be within your organization.
  • Click Share

By default, everyone who is on the share list will get an email. The Report consumer will now see the shared dashboard under their dashboard section when they log into PowerBI.com. If they are using any of the mobile PowerBI clients, they will see the new shared dashboard there as well.

Please note, if you change the dashboard, the people with whom it is shared will see the updates immediately.

Microsoft Project Online Timesheet Reporting using Power BI Desktop

PowerBI Timesheet Dashboard

Visualizing Project Time

Project Online supports collaborative work management across your organization. It provides the ability to manage multiple project schedules and allows project team members to enter the time spent on assigned tasks via their timesheet. The timesheet information can then be used for many purposes.

Many Project Management Offices (PMOs) have dynamic information needs that require data interactivity to uncover the sought-after answer. Power BI Desktop is the perfect tool to enable the data exploration process. In this article, we’ll focus on the PMO’s need to audit timesheet data, exploring who has submitted time and what types of time were submitted. Below is an example of an overall timesheet status dashboard, created in Power BI.

 

For those unfamiliar with Project, timesheet submission is generally a weekly process. The Project Manager publishes the latest project version. At the end of the week, project team members log how much time they spent on assigned tasks and submit their timesheet for approval. Project Managers then review and approve the timesheet submissions. Once this process is complete, status reports and other key data visualizations use this updated timesheet information to support organizational needs.

Figure 1 Weekly Timesheet Process

Success takes the whole village

Getting everyone to submit their timesheet in a timely manner is key to getting full value from time tracking. Thus, auditing timesheet submissions is an integral part of a successful time tracking implementation.

In a perfect world, everyone would submit their timesheet on time. The team member would get value from the data that supports their work. The organization would also get value from having data in a timely manner with minimal capture effort.

The village faces challenges

Many organizations face three time tracking challenges.

First, it’s difficult to get the entire organization to fill out their timesheet in a timely manner. Sometimes, you aren’t in the office on Friday so the timesheet doesn’t get submitted. Some people will simply forget to submit their time. Others may be more passive aggressive in that they view time tracking as a tax with little individual benefit.

Second, Project has no included functionality for easily auditing timesheet submission. If you want to use time tracking, you have to build your own business intelligence views to meet your analytic needs.

Lastly, reporting tools of years past focused on solving a particular business problem. They were not optimized for ad hoc querying to meet immediate and specific data needs. Below is an example of a dashboard for investigating where timesheets have gone missing.

PowerBI Timesheet Investigate Dashboard

 

This article will take you through the process of defining your business need, show you how to leverage a social design approach to design the solution, and walk through the steps to address the need in Power BI.

Designing for Business Intelligence

Business intelligence answers the questions that support specific organizational conversations. Experience in BI design indicates that mapping the social interactions first increases the chances that the designed visualizations will meet the business need.

I use Tumble Road’s Conversation-Centric Design (CCD) approach for designing Business Intelligence solutions. CCD helps you derive the technical design details by starting with the modeling of the targeted social interactions. This process will identify and define three key aspects: the audience, the conversations and the key questions. Once these aspects are defined, you can use these to define the data scenarios, individual data elements, common user terminology, and supporting processes.

Figure 2 Conversation-Centric Design Process

CCD can also be used as a scoping framework. For example, you can choose not to support a conversation and be able to track all aspects impacted by that decision. It also makes it easier to create a project Work Breakdown Structure based on Audience:Conversation:Key Questions structure, allowing the rollup of task progress to provide the health of each aspect automatically.

Requirements and Design

For this exercise, I’m limiting the design to one primary audience, the Project Management Office, and three key questions for the PMO.

Requirements

User Scenario

The previous week’s timesheet results for projects and administrative time are reviewed in the Monday morning PMO team meeting. Claire, the director, leads the meeting, where she reviews the results with Stephanie and Ian. The data is reviewed by open timesheet period. The outcome of the meeting is to follow up with the individual team managers to complete the process. Claire also uses the data to generate PMO reports for the company vice presidents. Therefore, she wants the data to be as complete as possible when generating the reports.

The organization uses administrative time categories, so all timesheet eligible resources are expected to turn in a timesheet weekly, even when the person is not currently assigned to a project. The PMO is only concerned with total time charged to a project. Generally, only one time category is received, Actual Work Billable, but due to an intern incident earlier in the year, all Actuals will be summed together to ensure all time is captured. The timesheet periods are defined as running from Sunday to Saturday each week.

Jason and Sophie need the timesheet data for their own uses with Finance and HR but are passive observers to the review process.


Figure 3 Requirements in CCD format

Conversation

The business intelligence visualization is needed to support the PMO weekly review of timesheet data at 9AM on Mondays.

Audience

The PMO consists of five people: one director and four analysts. Claire, Stephanie and Ian are direct consumers of the data and are the primary audience of this exercise. Jason and Sophie are indirect consumers of the data. Claire, Stephanie and Ian, therefore, will be the best choices for assessing whether the visualization meets their needs. Stephanie and Ian can also verify the data.

Key Questions

From the scenario above, we can derive the following key questions. NOTE: There are typically more than three but I find three is a great starting point.

  • Have all timesheet eligible resources submitted their timesheet?
  • Have all submitted timesheets been approved?
  • Which teams have outstanding timesheets?

Technical Design

Supporting Data

The supporting data will denote three states for timesheet eligible resources. NOTE: Project refers to people as resources. Therefore, resources and project team members are interchangeable for this article.

  • Have not created a timesheet
  • Submitted a timesheet, but timesheet is still in process
  • Timesheet was submitted and approved

The data set will be aggregated to the weekly level of granularity by resource and project. Only the last 60 days of data will be reviewed.

Data Sets

The primary data sets necessary to create the requisite visualizations are as follows.

Timesheet eligible resources are defined as work resources in Project that can log into the system, have an email address and are capable of working in the analyzed time frame.

  • Identify and retrieve timesheet eligible resources using the following conditions
    • Work resources – Resource Type is 2
    • Resource is active in the system – ResourceIsActive = 1
    • Who can log into the system – ResourceNTAccount is not null
    • Are capable of working in the analyzed time frame – Resource Base Capacity is greater than zero.
  • Identify and retrieve the submitted timesheets with associated project information for the last 60 days for timesheet periods that are open.
    • Timesheet Period Start Date is within the last 60 days.
  • Identify all timesheet eligible resources who did not create a timesheet
    • This data will need to be derived using the other two data sets.

Visualizations

In order to answer the key questions, a timesheet status field is needed to hold the state. The following visualizations are needed to show the core data.

  • Listing of each resource by timesheet period with the current timesheet submission status
  • Stacked bar chart by timesheet period with a count of timesheets in each timesheet status
    • Provides a visual indication of period level submission completeness.
  • Stacked bar chart by team with a count of timesheets in each timesheet status
    • Provides a visual indication of team level submission completeness.
  • Stacked bar chart by resource with a count of timesheets in each timesheet status
    • Provides a visual indication of resource level submission completeness.
  • Amount of time logged per project per timesheet period
  • Slicers will be required for timesheet period, team and timesheet status.

Data Elements for Visualization

This is the list of fields needed to generate the final visualizations. Other data will be needed in the process to generate the underlying data sets.

Project OData feed | Field | Definition | Common Terminology
Resources | ResourceDepartment | The department to which the resource belongs | Department
Resources | ResourceEmailAddress | Work resource email address | Email
TimesheetLine | PeriodEndDate | End date of timesheet period | End
TimesheetLine | ProjectID | Project ID for linking to other project information | Hidden from user
Resources | ResourceID | Resource ID for linking to other resource information | Hidden from user
TimesheetLine | TimesheetApproverResourceName | The person responsible for approving the person’s timesheet | Manager
Resources | ResourceName | Name of work resource | Name
Calculated | Index | Needed to ensure a unique key | People
TimesheetLine | TimesheetPeriodName | In this example, the period name follows this pattern: 2015-W01. Timesheet periods are 7 days long. | Period
TimesheetLine | TimesheetPeriodStatus | Open or Closed to timesheet submission | Period Status
TimesheetLine | PlannedWork | Amount of time originally planned for the work | Planned
TimesheetLine | ProjectName | Name of the project to which time is billed | Project
TimesheetLine | PeriodStartDate | Start date of timesheet period | Start
TimesheetLine | TimesheetStatus | The process status for a specific timesheet | Status
Resources | TeamName | The team assignment pool to which a resource may be assigned | Team
TimesheetLine | [ActualOvertimeWorkBillable] + [ActualOvertimeWorkNonBillable] + [ActualWorkBillable] + [ActualWorkNonBillable] | Amount of time entered against a project | Time
TimesheetLine | TimesheetPeriodName | In this example, the period name follows this pattern: 2015-W01. Timesheet periods are 7 days long. | Year

Supporting Processes

  • Weekly Time tracking process

Tools

  • Power BI Desktop
  • PowerBI.com

Building the Data

Data Feeds

There are four OData feeds necessary to build the timesheet audit report.

Resources

The resources feed provides the data that is visible via the Resource Center in Project Web App. This feed provides access to all resource custom fields.

ResourceTimephasedDataSet

This feed provides access to the resource capacity data at a day level of granularity. The Max Units field on the resource impacts the capacity available by day. If Max Units is set in PWA, all days will have the same capacity. Project Pro provides the additional capability to create different Max Units for specific date ranges.

TimesheetPeriods

This feed provides access to the date ranges per week for which timesheets are expected.

TimesheetLines

This feed provides the summary data as to what time was logged by the project team member.

Transformations

In order to build out the dataset for the visualization, the following transformations are required.

  1. Identify and retrieve timesheet eligible resources using the following conditions
    1. Work resources – Resource Type is 2
    2. Who can log into the system – ResourceNTAccount is not null
    3. The result of this step is the potential list of resources who can submit a timesheet
    4. NOTE: Typically, you could use ResourceIsActive for steps 2 and 3. Be aware that the value changes if you turn on the new Resource Engagements feature. Feel free to change this in the code based on your individual configuration.
  2. Join the Resources with the Resource Capacity records to determine if a person worked for a specific week
    1. Are capable of working in the analyzed time frame – Resource Base Capacity is greater than zero.
    2. The result of this step is the actual list of resources expected to submit a timesheet for the selected timesheet periods
  3. Create a record for each timesheet eligible resource in step 2 for each timesheet period within the 60 day range. This is similar to a cross join in SQL.
    1. This is required as no timesheet record is created until the resource accesses the timesheet. Therefore, if a resource never clicks the timesheet link, there’s no way to directly see that the timesheet wasn’t created.
    2. The result of this step is the master list of all timesheets expected for the selected timesheet periods.
  4. Join the submitted timesheets with the master timesheet list from step 3.
    1. The result of this step is the status of all timesheet submissions and non-submissions.


Figure 4 Order in which to perform the data transformations

Power BI Transformation Code

Resources

This M code calls the Resources OData feed and filters the records for only those defined as timesheet eligible. The columns are then limited to only those necessary to support the end visualizations.

let
Source = OData.Feed("https://YOURSERVERNAME.sharepoint.com/sites/YOURPROJECTWEBAPP/_api/ProjectData"),
Resources_table = Source{[Name="Resources",Signature="table"]}[Data],
#"Filter for Work Resources who are Users" = Table.SelectRows(Resources_table, each ([ResourceType] = 2) and ([ResourceIsActive] = 1) and ([ResourceNTAccount] <> null) and ([ResourceEmailAddress] <> null)),
#"Get core fields" = Table.SelectColumns(#"Filter for Work Resources who are Users",{"ResourceId", "ResourceCreatedDate", "ResourceEmailAddress", "ResourceName", "TeamName", "ResourceDepartments"})
in
#"Get core fields"

ResourceTimephasedDataSet

One of the requirements was to only expect a timesheet when the resource had available capacity. This requires a look at the data in a timephased manner.

This M code calls the ResourceTimephasedDataSet OData feed and performs the following transformations. Descriptive comments for the statement that follows are denoted with //. Also, each step has been renamed to provide clues as to its function, making the process somewhat self-documenting.

let
Source = OData.Feed("https://YOURSERVERNAME.sharepoint.com/sites/YOURPROJECTWEBAPP/_api/ProjectData"),
ResourceTimephasedDataSet_table = Source{[Name="ResourceTimephasedDataSet",Signature="table"]}[Data],

// Timephased datasets are usually huge; only get the last 60 days.
#"Filter for Last 60 Days" = Table.SelectRows(ResourceTimephasedDataSet_table, each Date.IsInPreviousNDays([TimeByDay], 60)),

// Since we only care about resources who could have worked, we want to
// examine BaseCapacity for any instance greater than zero.
// BaseCapacity is calculated using the resource's MaxUnits value.
#"Filter for BaseCapacity > 0" = Table.SelectRows(#"Filter for Last 60 Days", each [BaseCapacity] > 0),

// Need to group data by week, so insert a new column for week of year.
#"Inserted Week of Year" = Table.AddColumn(#"Filter for BaseCapacity > 0", "WeekOfYear", each Date.WeekOfYear([TimeByDay]), type number),

// Once grouped, join with the Resources feed from the previous query.
#"Merge with Resources" = Table.NestedJoin(#"Inserted Week of Year",{"ResourceId"},Resources,{"ResourceId"},"NewColumn",JoinKind.Inner),

// We need to expose the resource created date so that we can
// eliminate any capacity prior to the date of resource creation.
#"Expand to see Resource Fields" = Table.ExpandTableColumn(#"Merge with Resources", "NewColumn", {"ResourceCreatedDate"}, {"ResourceCreatedDate"}),

// Create a new column that uses Value.Compare to flag the nature
// of the relationship. A value of 1 means TimeByDay > Created Date.
#"Flag Valid Capacity" = Table.AddColumn(#"Expand to see Resource Fields", "ValidCapacity", each Value.Compare([TimeByDay],[ResourceCreatedDate])),

// Eliminate any rows which don't have a 1, which are all of the rows
// that exist before the resource was created.
#"Filter for Valid Capacity Only" = Table.SelectRows(#"Flag Valid Capacity", each ([ValidCapacity] = 1)),

// Group the result set by week number.
#"Group by ResourceID_WeekNumber" = Table.Group(#"Filter for Valid Capacity Only", {"ResourceId", "WeekOfYear"}, {{"Count", each Table.RowCount(_), type number}, {"Capacity", each List.Sum([BaseCapacity]), type number}, {"Startd", each List.Min([TimeByDay]), type datetime}})
in
#"Group by ResourceID_WeekNumber"

Timesheet Lines

This M code retrieves the time logged against the individual projects.

let
Source = OData.Feed("https://YOURSERVERNAME.sharepoint.com/sites/YOURPROJECTWEBAPP/_api/ProjectData"),
TimesheetLines_table = Source{[Name="TimesheetLines",Signature="table"]}[Data],
#"Removed Other Columns" = Table.SelectColumns(TimesheetLines_table,{"TimesheetLineId", "ActualOvertimeWorkBillable", "ActualOvertimeWorkNonBillable", "ActualWorkBillable", "ActualWorkNonBillable", "CreatedDate", "PeriodEndDate", "PeriodStartDate", "PlannedWork", "ProjectId", "ProjectName", "TimesheetApproverResourceId", "TimesheetApproverResourceName", "TimesheetName", "TimesheetOwner", "TimesheetOwnerId", "TimesheetPeriodId", "TimesheetPeriodName", "TimesheetPeriodStatus", "TimesheetPeriodStatusId", "TimesheetStatus", "TimesheetStatusId"}),

// Only get the last 60 days of timesheet lines.
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each Date.IsInPreviousNDays([PeriodStartDate], 60))
in
#"Filtered Rows"

Timesheet Periods (Renamed to Timesheet Analytics)

This becomes the primary dataset for user-facing visualizations. Everything is predicated on the timesheet period, so it makes sense to orchestrate the rest of the data around this dataset.

let
Source = OData.Feed("https://YOURSERVERNAME.sharepoint.com/sites/YOURPROJECTWEBAPP/_api/ProjectData"),
TimesheetPeriods_table = Source{[Name="TimesheetPeriods",Signature="table"]}[Data],

// Filter records for the last 60 days.
#"Filter for last 60 days" = Table.SelectRows(TimesheetPeriods_table, each Date.IsInPreviousNDays([StartDate], 60)),

// Cross join of timesheet periods and timesheet eligible resources
// creates a master list of all expected timesheets.
#"Added Resource for every Period" = Table.AddColumn(#"Filter for last 60 days", "AllResources", each #"Resources"),

// Show all relevant resource fields.
#"Expand to show Resource Fields" = Table.ExpandTableColumn(#"Added Resource for every Period", "AllResources", {"ResourceId", "ResourceCreatedDate", "ResourceEmailAddress", "ResourceName", "TeamName", "ResourceDepartments"}, {"ResourceId", "ResourceCreatedDate", "ResourceEmailAddress", "ResourceName", "TeamName", "ResourceDepartments"}),

// Left join to match submitted timesheets with
// the master list of expected timesheets.
#"Merge with TimesheetLines" = Table.NestedJoin(#"Expand to show Resource Fields",{"PeriodId", "ResourceId"},TimesheetLines,{"TimesheetPeriodId", "TimesheetOwnerId"},"NewColumn",JoinKind.LeftOuter),

// Show the timesheet status.
#"Show Timesheet Status" = Table.ExpandTableColumn(#"Merge with TimesheetLines", "NewColumn", {"ActualOvertimeWorkBillable", "ActualOvertimeWorkNonBillable", "ActualWorkBillable", "ActualWorkNonBillable", "PlannedWork", "ProjectId", "ProjectName", "TimesheetApproverResourceId", "TimesheetApproverResourceName", "TimesheetStatus"}, {"ActualOvertimeWorkBillable", "ActualOvertimeWorkNonBillable", "ActualWorkBillable", "ActualWorkNonBillable", "PlannedWork", "ProjectId", "ProjectName", "TimesheetApproverResourceId", "TimesheetApproverResourceName", "TimesheetStatus"}),

// If timesheet status is null, there are no matching timesheet lines;
// therefore, the timesheet must not have been created.
#"Replace Timesheet Status null with Not Created" = Table.ReplaceValue(#"Show Timesheet Status",null,"Not Created",Replacer.ReplaceValue,{"TimesheetStatus"}),

// Join with resources that were working during this time.
#"Merge with ResourceTimephasedData" = Table.NestedJoin(#"Replace Timesheet Status null with Not Created",{"ResourceId"},ResourceTimephasedDataSet,{"ResourceId"},"NewColumn",JoinKind.Inner),

// Per requirement, combine all actuals columns into one.
#"Total up all actuals" = Table.AddColumn(#"Merge with ResourceTimephasedData", "Time", each [ActualOvertimeWorkBillable]+[ActualOvertimeWorkNonBillable]+[ActualWorkBillable]+[ActualWorkNonBillable]),

// For missing timesheets, replace nulls with zero.
#"Replace Time nulls with zeros" = Table.ReplaceValue(#"Total up all actuals",null,0,Replacer.ReplaceValue,{"Time"}),

// Rename columns to align with user terminology.
#"Renamed Columns" = Table.RenameColumns(#"Replace Time nulls with zeros",{{"Description", "Period Status"}, {"EndDate", "End"}, {"PeriodName", "Period"}, {"StartDate", "Start"}, {"ResourceEmailAddress", "Email"}, {"ResourceName", "Name"}, {"TeamName", "Team"}, {"ResourceDepartments", "Department"}, {"PlannedWork", "Planned"}, {"ProjectName", "Project"}, {"TimesheetApproverResourceName", "Manager"}, {"TimesheetStatus", "Status"}}),

// Narrow down the data set to only the fields required.
#"Removed Other Columns" = Table.SelectColumns(#"Renamed Columns",{"Period Status", "End", "Period", "Start", "Email", "Name", "Team", "Department", "Planned", "Project", "Manager", "Status", "Time"}),

// Change dates to only show the date, not the datetime.
#"Changed dates to only dates" = Table.TransformColumnTypes(#"Removed Other Columns",{{"End", type date}, {"Start", type date}}),

// Add a unique index that will provide additional functionality in Q&A.
#"Added Index" = Table.AddIndexColumn(#"Changed dates to only dates", "Index", 1, 1),

// Rename the index to use the People term.
#"Renamed Index for Q and A support" = Table.RenameColumns(#"Added Index",{{"Index", "People"}})
in
#"Renamed Index for Q and A support"

Want More? Let us teach you Power BI.


Heroic Business Intelligence is coming!


Thank you for the great response!

I created a video last week called “the most interesting timesheet audit report,” mainly as a tongue-in-cheek way of introducing the cool functionality of Power BI over what is normally a dry reporting topic. The response was great! So great that I’m putting together a hybrid class to teach people how to use the new Power BI.

For those about to crunch, we salute you.

Heroic Business Intelligence attendees should be those who really like to crunch the numbers. These may be Finance people, IT, Marketing or others. They don’t have to have deep technical skills. They also don’t have to be using Project as I’m not focusing strictly on Project for this class. They should have a need to analyze data for themselves and for others.

Hybrid > Traditional Classes

I’ve always hated it when I’ve crammed my brain full of new knowledge for a few days, only to lose 80% of it by the time I got back to work. So, I’m teaching this as a hybrid class. By that, I mean the instruction will occur online, live via a webcast with an interactive chat. All sessions will be recorded, so if you can’t attend at the time, you can watch the recording. You will also have access to the course material and any updates for as long as they are hosted. There will also be a private forum for attendees to share thoughts and ideas.

Peer Support

I attended a class structured like this last year. I still talk to my class peers nearly every day. The network alone was worth the price of the class. The approach also allows you more time to absorb the content and apply it in your environment. I’m also hosting an Office Hour session each week, so that you can ask more detailed questions without having to do it during class.

Join Us Today, Registration Closes Aug 20th

I see this more as a journey and less as a class. If you want to join us, click here.

Converting RTF to Text in SQL Part 3


Yes, Part 3. This technique was published some time ago and, as it gets further “field testing,” new conditions have come to light which require a revision of this technique. Assuming this is the last post on this topic, I still need to figure out how to duplicate this logic in Power BI (a rough first sketch in M follows the SQL below).

I consider this to be version 15 or so of this query since my first version was written in mid-September of 2010. The changes here are as follows.

First, the DTD definitions have been expanded to include all possible HTML codes. This should prevent a report from blowing up when someone decides to use an unexpected symbol somewhere.

Second, I keep seeing random <br> tags in the fields, which, ironically, is breaking the query. I’ve added a replace statement for that as well.

Lastly, I’ve expanded the field sizes to Max where necessary to accommodate truncation issues.

Sample code is below. I strongly recommend copying it into Notepad first so that you don’t accidentally get curly quotes. I’ve also left it as straight text to make it easier to copy.

declare @Headxml nvarchar(3000)
declare @Footxml nvarchar(50)

set @Headxml = N'<?xml version="1.0"?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd"
[<!ENTITY quot "&#34;">
<!ENTITY amp "&#38;">
<!ENTITY lt "&#60;">
<!ENTITY gt "&#62;">
<!ENTITY nbsp "&#160;">
<!ENTITY iexcl "&#161;">
<!ENTITY cent "&#162;">
<!ENTITY pound "&#163;">
<!ENTITY curren "&#164;">
<!ENTITY yen "&#165;">
<!ENTITY brvbar "&#166;">
<!ENTITY sect "&#167;">
<!ENTITY uml "&#168;">
<!ENTITY copy "&#169;">
<!ENTITY ordf "&#170;">
<!ENTITY laquo "&#171;">
<!ENTITY not "&#172;">
<!ENTITY shy "&#173;">
<!ENTITY reg "&#174;">
]><html><body>'

set @Footxml = N'</body></html>'

select *
,ISNULL(LTRIM((CONVERT(xml,(@Headxml + replace([YourMulti-lineCustomField],'<br>','') + @Footxml),3).value(N'(/)','nvarchar(Max)'))),'') AS [YourMulti-lineCustomFieldNewName]

,ISNULL(LTRIM((CONVERT(xml,(@Headxml + replace([YourMulti-lineCustomField2],'<br>','') + @Footxml),3).value(N'(/)','nvarchar(Max)'))),'') AS [YourMulti-lineCustomFieldNewName2]

FROM dbo.MSP_EpmProject_UserView
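
On the earlier note about duplicating this logic in Power BI: I haven’t ported it yet, but one possible starting point in M is a small tag-stripping function. This is a hypothetical sketch, assuming tags never contain a > inside attribute values:

let
// Hypothetical helper: drop everything between each < and the next >
StripHtml = (input as nullable text) as text =>
let
t = if input = null then "" else input,
parts = Text.Split(t, "<"),
// For each fragment, keep only the text after the closing >
kept = List.Transform(parts, each
let pos = Text.PositionOf(_, ">")
in if pos < 0 then _ else Text.Range(_, pos + 1))
in
Text.Combine(kept)
in
StripHtml("<html><body>Status is <b>Green</b></body></html>") // returns "Status is Green"

It doesn’t decode HTML entities the way the DTD above does, so entity handling would still need its own replace steps.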

Is Your Data Looking for a Problem to Solve?

Data can be a wonderful thing. There are so many stories you can tell with the right data. These stories have the power to persuade and motivate positive changes in the organization. And so the marketing stories go on and on about the power of data.

The question is, do you have the right data? Can you tell compelling stories with your data? How do you know what is possible or needed? Let’s discuss some straightforward techniques to help you determine the answer.

Many times, I’ve walked into a client’s office and been greeted with large quantities of data. However, the client was still struggling to tell a compelling story from their data store. There was always doubt as to whether the reports they generated were useful or compelling.

I’ve also encountered clients who maintained additional data that they thought someone *might* need. Unfortunately, wishful thinking is not an effective business strategy and can waste scarce company resources.

Here are some signs that you might need to rethink your data strategy.

  • Do your reports require data definitions, report keys and long explanations for someone to understand what they are viewing?
  • Do you have a training session on “how to use the reports”?
  • Are you maintaining data whose purpose or use you are unsure of?

If you answered yes to any of these, it might be time for a little housecleaning.

The Effective Simplicity™ approach dictates that we maintain as little data as possible to meet the business need. Minimizing data reduces overall cost by reducing the overhead of managing the system. The trick is to determine what data is needed, which is sometimes easier said than done.

At Tumble Road, we use an approach that follows an understandable pattern and helps determine the context and the need for the right data. The pattern is Conversation – Question – Supporting Data and Activities.

Conversation is about identifying the specific meeting / use case that the tool will ultimately support. A diagram of some standard Project Management conversations is illustrated below. The conversation will have a schedule, so you can determine how often the data needs to be updated. The conversation has standard participants, which is helpful if you need feedback on the data you will need to provide. It also helps later, should you need to inform the report consumers of an upcoming change. Lastly, a list of 1-3 key questions is defined for the use case.

Project Communications

Key Questions are the information needs that need to be supported. The Key Question also determines the form in which the answer must be presented.

For example, if the conversation is the weekly Tuesday portfolio status meeting between IT Management and Finance, you will likely need to answer questions similar to:

  • What have we spent?
  • What did we plan to spend so far?
  • What are we planning to spend in the near future?

Supporting Data and Activities are the exact data elements that allow the key question to be answered, plus the activities necessary to generate, collect and maintain said data. The data can help you determine other incidental data needed for organizing and filtering the Supporting Data. The activities can help you spot process gaps that would prevent you from successfully addressing the question.

When examining the “What have we spent?” question above, Finance wants Project spend broken down by Project and Cost Type (Capital or Expense), with the totals summed by Fiscal Year and Fiscal Period for each IT Director.

From this short exercise, I already know the following is needed:

    • A lookup table for Cost Type with values of Capital, Expense.
    • A task custom field with Assignment Roll Down of the values enabled.
    • A lookup table for IT Directors, to maintain consistency.
    • A Project custom field for IT Director so that this can be assigned to each project.
    • To add this Project custom field to the correct Project Detail Page so that the PM can maintain this data.
    • Resource rates in the system so that costs can be automatically calculated.
    • To enter the Fiscal Periods in Project Online.
    • A project template which exposes the Cost Type column at the task level.
    • The cross-tab report layout
Director/Project/Cost Type | FY2014-M01 | FY2014-M02 | FY2014-M03 | FY2014-M04 | FY2014-M05
John Smith | $100 | $100 | $100 | $100 | $100
Project X | $100 | $100 | $100 | $100 | $100
Capital | $50 | $50 | $50 | $50 | $50
Expense | $50 | $50 | $50 | $50 | $50

As you can see, you can generate quite a bit of actionable detail using this approach. There are several follow-up questions that can now be asked since you have specific, concrete examples from which to work. For example:

  • Do you need filtering by director?
  • Do you show only incomplete projects?
  • Is the IT Director data something that should be maintained by PMs?

This approach also aids in designing supporting processes. You already know this is a weekly review, so cost information has to be updated on a weekly basis by the PM. As the meeting happens on Tuesday, Monday is the deadline to get updates in and published.

A final advantage is that it is easy to track the progress of visible work by your users. For these engagements, the top-level summary task represents the conversation, a key question represents a sub-task, and key activities, such as gathering required data and defining the maintenance process, are the third-level tasks. This makes it easy to determine the “health” of the conversation implementation.

Once you are maintaining the data essential to answering the questions, with clearly defined uses, you should see a reduction in overhead as well as evidence of easier training conversations. If you would like to learn more, please join our Community here.

Querying Multi-Value Custom Fields


Scenario

You have a report where the need is to show multiple values for a given custom field. For example, you have a multi-value Project custom field for Impacted Business Organizations.

You want to see your values as a comma delimited list so that this can be used in an Excel pivot table or SSRS tablix report. You might need something like:

Project | Impacted Business Orgs
Project XYZ | IT, HR, Operations

 

The Background

When a Project text custom field with an associated lookup table is made multi-value, a number of changes are made in the Reporting Database. First, the field is removed from the MSP_EpmProject_UserView, as that view only supports single-select Project text custom fields with associated lookup tables. Second, a new Association View is created, which has the following naming convention: MSPCFPRJ_YourMultiValueCustomFieldName_AssociationView

MSP prefixes all Microsoft views; CF stands for Custom Field and PRJ for the Project entity. This association view contains a record for each of the multiple custom field values selected, linking the Project record to the lookup table values in the MSPLT_YourMultiValueCustomFieldLookUpTableName_UserView view. LT, in this case, stands for lookup table, so there is an MSPLT view for each lookup table in the Reporting Database.

This mechanism was first documented in the Project Server 2007 Report Pack that I wrote and can be found here: http://msdn.microsoft.com/en-us/library/office/bb428828(v=office.12).aspx  The Portfolio report also provides another way to utilize the multi-value field.

The Query

This query uses the XML functionality to build the concatenated string, based on a technique documented on StackOverflow here.

Once I modified the STUFF statement for use with Project Server, I wrapped it with an outer SELECT to combine it with all of the data from MSP_EpmProject_UserView. Note, if you have multiple multi-value fields, you will have to duplicate the inner piece in the parentheses for each field. The YourMultiValueCustomFieldName placeholders mark the places to replace with your own field and lookup table names.

SELECT MSP_EpmProject_UserView.*
             , MVList.[YourMultiValueCustomFieldNameValues]
FROM MSP_EpmProject_UserView
INNER JOIN
   (SELECT   MSP_EpmProject_UserView.ProjectUID 
            ,ISNULL(STUFF((SELECT ', '+ MemberValue 
    FROM [MSPLT_YourMultiValueCustomFieldLookUpTableName_UserView] 
    INNER JOIN [MSPCFPRJ_YourMultiValueCustomFieldName_AssociationView] 
    ON [MSPLT_YourMultiValueCustomFieldLookUpTableName_UserView].LookupMemberUID = 
    [MSPCFPRJ_YourMultiValueCustomFieldName_AssociationView].LookupMemberUID
    WHERE [MSPCFPRJ_YourMultiValueCustomFieldName_AssociationView].EntityUID = 
    MSP_EpmProject_UserView.ProjectUID 
    FOR XML PATH(''), TYPE
    ).value('.','varchar(max)')
    ,1,2, ''),'')AS YourMultiValueCustomFieldNameValues
FROM    MSP_EpmProject_UserView 
GROUP BY ProjectUID ) MVList
ON MSP_EpmProject_UserView.ProjectUID = MVList.ProjectUID

The Output

The output will yield a comma-delimited list of values in the last column of the dataset. If you need that comma-delimited list sorted, add an ORDER BY MemberValue clause to the innermost SELECT, just before the FOR XML PATH('') statement.

Database Diagrams–Project Server Reporting Database


These high-level entity relationship diagrams were first published in the deck for my Project Conference hands-on lab. I’ve had a number of requests for this information, so here it is.

These diagrams are based on the 2010 RDB, but the 2013 RDB should not be materially different. For 2007 users, many of these same entities exist in the 2007 RDB as well. The UID fields are the keys used to join these entities together.

The name of the entity also contains the recommended table or view name to be used in the Reporting database. Any entity ending with _UserView will automatically include custom fields for that entity. Multi-value custom fields are not included and require special querying to retrieve (more on that later).

If you are tying different data entities together, you should also use the lowest level of detail for a given data element. For example, if you are querying Project – Task – Assignments – Resources, I would use AssignmentWork rather than TaskWork or ProjectWork, as AssignmentWork will aggregate correctly in PivotTables. Otherwise, you will get a multiple of ProjectWork or TaskWork, depending on the number of records retrieved.

Click on the graphic below to make it bigger.

Project Server Reporting Database Entity Relationship Diagram


Project Server Reporting Database Timesheet Entity Relationship Diagram


Happy querying!