Product Information Management (PIM) As a Stepping Stone



Scott Cairney, VP Product Management, Cactus Commerce

Dominic Citino, Retail Solution Specialist, Microsoft

For those of you who are not familiar with PIM, let us apologize in advance for cluttering your mind with yet another acronym. Product Information Management, or PIM, refers to the systems, strategies, and processes that are used to manage product data. Although the concepts of PIM are certainly nothing new, retailers are now addressing their PIM needs in the context of their multichannel strategy. Because of what PIM provides, many retailers see it as a precursor to, or a dependency of, their broader E-commerce solution initiatives. Analysts are widely reported to be predicting a huge E-commerce platform refresh over the next 1–3 years. If that prediction is accurate, we suggest that there will be a large focus on PIM solutions, too.

We must deconstruct PIM a bit to understand why retailers are finding it so critical to their multichannel capabilities. PIM, as a concept, seeks to enable true item management outside the dependent application environments. Go back in time to the era of the merchandising system/enterprise resource planning (ERP) boom. Retailers were buying or building big merchandising applications and/or ERP systems that focused on providing “foundation data” to run their enterprises. These enterprises were almost always focused on the store and warehouse environment. Every merchandising application has some form of “item master.” This data store typically contains a big list of fields and in some cases some basic hierarchies. Retailers typically use some basic editor screens to manage and manipulate this data along with any automated data feeds that they have. The data in the item master is the core of a retailer’s foundation data.

As application environments became more complex, retailers needed to disseminate this item data to many other systems and tools. Worse yet, the functional needs of each system often varied greatly. For example, an execution system for a supply chain, such as a warehouse management system (WMS), requires a vast array of handling rules and dimensions to which merchandising applications are often blind. So the retailer had to decide whether to build these new fields (and the ability to maintain them) into the merchandising application, or build application-specific item attributes that remained in the application that required them. Multiply this scenario exponentially across application environments and functional areas, and retailers ended up with a mess.

The final nail in the coffin of the application-centric item data store was the emergence of multichannel retailing. Retailers struggled to syndicate item information with core merchandising applications in a channel-specific manner. Each channel might require different versions of item information. For example, the Web might require rich product descriptions and images, whereas logistics applications might require dimensions and other physical data. A shopping affiliate partner might require an XML feed that was limited to certain fields that it supported.
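To make the channel-specific requirement concrete, the following minimal Python sketch derives three different views from one item record. The field names, the channel names, and the XML layout are assumptions that we chose for illustration; they do not come from any particular merchandising product or affiliate feed specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical item record; every field name here is illustrative only.
ITEM = {
    "sku": "SKU-1001",
    "name": "Trail Running Shoe",
    "description": "Lightweight shoe with a cushioned sole.",
    "image_url": "https://example.com/images/sku-1001.jpg",
    "weight_kg": 0.62,
    "length_cm": 32.0,
    "width_cm": 12.0,
    "height_cm": 11.0,
    "price": "89.99",
}

# Each channel declares the fields that it consumes (assumed lists).
CHANNEL_FIELDS = {
    "web": ["sku", "name", "description", "image_url", "price"],
    "logistics": ["sku", "weight_kg", "length_cm", "width_cm", "height_cm"],
    "affiliate": ["sku", "name", "price"],
}


def channel_view(item, channel):
    """Return only the fields that the given channel consumes."""
    return {field: item[field] for field in CHANNEL_FIELDS[channel]}


def affiliate_xml(item):
    """Serialize the affiliate view as a small XML fragment."""
    root = ET.Element("product")
    for field, value in channel_view(item, "affiliate").items():
        ET.SubElement(root, field).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

Keeping the field lists in data rather than in code means that a new channel can be added without touching the item record or the other channels, which is the kind of separation that the application-centric item master made difficult.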

A PIM solution promises to move the processes for aggregating, maintaining, and syndicating product information out of legacy applications, including ERP and core merchandising products. The vision here is a nimble architecture that enables interaction with product information in a manner that is not dependent on a specific application environment. The PIM can serve as a single source of truth for item data, cleanly separated from other business rules. From a multichannel perspective, PIM enables retailers to maintain and syndicate product information to their sales channels in a much more agile fashion so that the process is better tailored to the requirements of each channel.

There are a couple of views of PIM that are prevalent across the software industry. Some view PIM as a software application. In other words, PIM functionality in a PIM solution will address the challenges of item management. Others view PIM as a strategy that tackles item management by using tools and processes. Although both views have some validity, it is important for a retailer to examine internal discipline and processes to understand which direction is best for the organization.

Given the promises of PIM, many retailers are looking to “get their house in order” by using a PIM implementation, either as a first step toward a refresh of their E-commerce platform or as part of that refresh.

A PIM solution must make all product information available from a central point in a reliable, meaningful, and timely way to the people, processes, and applications that rely on it. Product data in most organizations is scattered throughout business systems, trading-partner networks, and supply chain networks, in addition to residing in the minds of the people who run the organization. Much of this data is scattered for good reasons: best-of-breed systems rely on it, and systems that address differing needs often have overlapping data requirements. The question is therefore how we can organize this situation to work for us as we strive to meet our business challenges. Figure 1 proposes a PIM solution that is based on two things: a service-oriented architecture (SOA) that has controlling processes, and a cache that enables low-latency, “always-on” reliable access to the data.


Figure 1. Functional architecture to support PIM in multichannel environment

Figure 1 highlights the components that are necessary for the PIM implementation. Around the outside are the consumers and providers of the data. These are systems such as ERP, warehouse management, merchandising, and supply chain systems. These entities provide product data, consume data, or bi-directionally share data with other systems or processes within the organization. By organizing the integration of these systems with PIM control data and business rules, it is possible to identify each system as an owner or a consumer of data.

The PIM cache is strategically located within the multichannel retail (MCR) data repository for several reasons. The main reason is that E-commerce is the primary low-latency consumer of product data. Other reasons include fault-tolerant reliability and malleable storage structures.

The final aspect of this conceptual approach is the human workflow, which you enable by using collaboration tools. You can consider this interface as the window into the PIM system because it is where individuals interact with product data. This human interaction is connected directly to the business rules. Unlike system-to-system integration, this interaction must feel like real time and yet support all of the same business rules and routing. You must enforce these rules and routing to ensure complete data integrity at all times.

You must begin the implementation of your PIM solution, regardless of scope, by identifying and cataloging all product data sources. It is only after you have thoroughly cataloged product data throughout the organization that you can begin the process of organizing the data. You should note that often, a lot of data resides in the minds of employees in addition to sources outside the organization. You must consider this data in any successful PIM implementation.

After you have cataloged the data, you can assign ownership, synchronization rules, and identification schemes for all of the data. The PIM control database stores this metadata. In parallel, you can initiate an integration program because you will require connectivity to the data sources. Some data sources will be simple providers of data, for example, an ERP system that does not support external updates to its data, or a vendor who provides product data. Other systems will permit synchronization. For those systems, the PIM business rules must either enforce the synchronization rules that each system already defines or incorporate those rules directly. (Integration involves more than tying data endpoints; it also involves the consumption of business processes and rules.)
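As a rough illustration of the metadata that the control database might hold, the following Python sketch records an owner, a synchronization mode, and an identification scheme for each cataloged attribute. The class names, the two synchronization modes, and the sample entries are all hypothetical; a real implementation would derive them from the catalog of your own data sources.

```python
from dataclasses import dataclass
from enum import Enum


class SyncMode(Enum):
    PROVIDER_ONLY = "provider_only"  # source supplies data but rejects external updates
    BIDIRECTIONAL = "bidirectional"  # source also accepts updates from the PIM


@dataclass(frozen=True)
class AttributeControl:
    attribute: str       # attribute name as cataloged
    owner: str           # system of record for this attribute
    sync_mode: SyncMode  # how the PIM may exchange this attribute with the owner
    key_field: str       # identifier that the owner uses for the item


# Hypothetical control entries; the systems and fields are examples only.
CONTROL = {
    entry.attribute: entry
    for entry in [
        AttributeControl("cost", "erp", SyncMode.PROVIDER_ONLY, "item_number"),
        AttributeControl("case_dimensions", "wms", SyncMode.BIDIRECTIONAL, "sku"),
        AttributeControl("vendor_description", "vendor_feed",
                         SyncMode.PROVIDER_ONLY, "vendor_part_id"),
    ]
}


def can_write_back(attribute):
    """Return True if the owning system permits updates from the PIM."""
    return CONTROL[attribute].sync_mode is SyncMode.BIDIRECTIONAL
```

With entries like these in place, a business rule can check `can_write_back("cost")` before it attempts to push an edit to the ERP system, instead of hard-coding that knowledge in each integration.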

Now that you have implemented connectivity to all sources, you must account for all of the data that does not exist in any of the sources. Often, this is a significant amount of high-value information that is used to sell the product. This data, along with all of the other data that is required in highly scalable, low-latency, or highly reliable ways, is cached in the MCR data repository. You use this repository to cache and persist unstructured or customer-facing data because of the repository’s ability to support high scale and reliability. In addition, the customer-facing application is the primary source of real-time or low-latency requests and is also the natural repository for intentioned information.

When you have connections and control in place, you can source data centrally by issuing a request to the service-oriented infrastructure. The infrastructure opens the request, obtains source information from the control data, obtains the data through the various connections to the line-of-business systems, and returns the data in a consistent fashion based on the contract that you specified. If a source is not available, business rules may elect to reply with cached data from the PIM cache to provide “always-on” information.
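The request flow that is described above can be sketched in a few lines of Python. The connector functions, the `SOURCES` and `PIM_CACHE` structures, and the shape of the response are assumptions made for illustration; in particular, the connectors are stand-ins and do not call any real ERP or WMS product.

```python
class SourceUnavailable(Exception):
    """Raised by a connector when its line-of-business system cannot be reached."""


def erp_connector(sku):
    # Stand-in for a real ERP call; it always succeeds in this sketch.
    return {"cost": "41.50"}


def wms_connector(sku):
    # Stand-in for a real WMS call; it simulates an outage.
    raise SourceUnavailable("wms")


# Control data: which connector owns which attribute (assumed layout).
SOURCES = {"cost": erp_connector, "weight_kg": wms_connector}

# Last known good values, keyed by (sku, attribute).
PIM_CACHE = {("SKU-1001", "weight_kg"): 0.62}


def get_product(sku, attributes):
    """Resolve each attribute from its owner, falling back to the cache on outage."""
    response = {"sku": sku, "data": {}, "stale": []}
    for attribute in attributes:
        try:
            value = SOURCES[attribute](sku)[attribute]
            PIM_CACHE[(sku, attribute)] = value  # refresh the cache on success
        except SourceUnavailable:
            if (sku, attribute) not in PIM_CACHE:
                raise  # no cached copy exists, so the outage is surfaced
            value = PIM_CACHE[(sku, attribute)]
            response["stale"].append(attribute)  # flag the value as served from cache
        response["data"][attribute] = value
    return response


result = get_product("SKU-1001", ["cost", "weight_kg"])
```

In this sketch, the response lists which attributes came from the cache so that the caller can decide whether stale data is acceptable. Whether to reply with cached data at all is a business-rule decision, as the preceding paragraph notes.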

Architecturally, you could build this solution by using the technologies that are shown in Figure 2.


Figure 2. Functional architecture to support PIM with Microsoft Technologies

In Figure 2, you can see that Microsoft Commerce Server 2007 provides the core to the multichannel foundation. Various items can be built on this foundation, such as an E-commerce Web site, data that is provided to point-of-sale (POS) systems, and business-to-business (B2B) E-commerce or commerce over nontraditional channels such as mobile channels, gaming channels, and so on. Central, common data is a key enabler to multichannel initiatives.

Figure 2 also shows that the Microsoft SQL Server database software supports Commerce Server 2007. SQL Server not only manages the data but also provides it in a highly reliable and scalable manner. These are native capabilities in SQL Server.

On the left of Figure 2, you can see that Microsoft Office SharePoint Server hosts the Marketing and Merchandising management experiences. This is because Office SharePoint Server is a natural platform for the implementation of information-worker tools. It provides the capabilities that are required for multiple people to collaborate on a given set of data, in this case product information. Office SharePoint Server supports capabilities such as authentication, security, roles, workflow, presentation, and worklists, which together provide a stable platform for a consistent business-user experience.

Finally, the entire system is supported by the SOA, which Microsoft BizTalk Server implements. BizTalk Server and its messaging capabilities support out-of-the-box connection to many line-of-business systems and data repositories, in addition to providing a framework for the simple creation of custom connections. Intelligent routing enables you to configure publishers and subscribers of the data. The business process and rules capabilities support the implementation of logic that coordinates the ownership of data, exception processing, and the transactional nature of the platform. These features ensure that long-running and short-running transactions are enforced and compensated as required.
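The publisher and subscriber idea can be illustrated independently of any product. The following Python sketch is not BizTalk Server code and does not use its API; the `Router` class, the topic names, and the message shape are hypothetical and show only the routing concept.

```python
from collections import defaultdict


class Router:
    """Delivers each published message to every subscriber of its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callable that receives messages for the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Send the message to all subscribers and return the delivery count."""
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(message)
        return len(handlers)


# Two consuming channels subscribe to price changes; only one needs dimensions.
web_inbox, pos_inbox, wms_inbox = [], [], []
router = Router()
router.subscribe("price.changed", web_inbox.append)
router.subscribe("price.changed", pos_inbox.append)
router.subscribe("dimensions.changed", wms_inbox.append)

delivered = router.publish("price.changed", {"sku": "SKU-1001", "price": "84.99"})
```

The publisher does not need to know which systems consume a change. Adding a new consumer is a matter of adding a subscription, which is what makes this style of routing attractive for multichannel syndication.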
