Friday, June 16, 2023

Master Data Management (MDM)

 

Master Data Management (MDM) refers to a set of processes, tools, and strategies that aim to create and maintain a single, consistent, and authoritative source of master data within an organization. Master data refers to the core business entities and attributes that are shared across multiple systems and applications, such as customers, products, vendors, locations, or employees.

The primary goal of MDM is to ensure data consistency, accuracy, and integrity across different systems, departments, and business processes. It involves defining and managing the key entities and attributes that are critical for the organization's operations, enabling a unified and reliable view of master data.

Here are key steps of master data management:

1.     Data Integration: MDM focuses on integrating master data from various sources, systems, and applications into a centralized repository or database. It consolidates data from disparate systems, resolving data conflicts, and inconsistencies to create a single, trusted version of master data.


2.     Data Governance: MDM encompasses data governance practices and policies to establish rules, standards, and processes for managing master data. It ensures that data ownership, accountability, and data quality responsibilities are defined and enforced throughout the organization.


3.     Data Quality Management: MDM includes data quality processes and tools to improve the accuracy, completeness, and consistency of master data. It helps identify and resolve data quality issues, such as duplicates, missing values, or incorrect entries, ensuring that high-quality data is available for decision-making and operational processes.


4.     Data Harmonization and Standardization: MDM facilitates the harmonization and standardization of master data across different systems and applications. It establishes consistent data formats, naming conventions, and data structures to ensure compatibility and interoperability across the organization.


5.     Data Access and Sharing: MDM provides mechanisms for controlled data access and sharing. It allows authorized users and systems to access and retrieve master data based on defined permissions and security controls. This enables data consistency and enables efficient data sharing across departments and systems.


6.     Data Lifecycle Management: MDM addresses the entire lifecycle of master data, including creation, update, archival, and retirement. It ensures that master data is properly managed throughout its lifecycle, supporting compliance with regulatory requirements and business policies.

7.     Data Mastering and Golden Record: MDM identifies the most accurate and reliable version of master data known as the "golden record." It employs data mastering techniques to consolidate and merge data from various sources, resolving conflicts and creating a single, authoritative version of each master data entity.


8.     Data Relationship Management: MDM recognizes and manages the relationships and hierarchies between different master data entities. For example, it establishes relationships between customers and their associated orders, products and their categories, or employees and their organizational units. This enables a holistic view and analysis of interconnected data.


Benefits of MDM

By implementing MDM, organizations can improve data quality, reduce data inconsistencies and redundancies, enhance decision-making, and achieve a unified view of critical business entities. It helps organizations maintain accurate and consistent master data, enabling better operational efficiency, customer service, and strategic insights.

Data Governance

Data Governance


Data governance is a set of processes, policies, and guidelines that ensure the effective and responsible management of an organization's data assets. It involves establishing a framework to define how data is collected, stored, accessed, and used within an organization. 

 The primary goal of data governance is to ensure the quality, availability, integrity, and security of data across the entire data lifecycle. It provides a structure for making data-related decisions, resolving conflicts, and aligning data management practices with organizational goals and regulatory requirements. 

Following are the Key components of data governance: 

 1. Data Stewardship: Identifying data stewards who are responsible for managing specific data domains or sets, ensuring data quality, and enforcing data policies. 

 2. Data Policies: Developing and implementing policies and guidelines that define how data should be managed, including data privacy, security, retention, and usage rules. 

3. Data Standards: Establishing standards for data naming conventions, formats, definitions, and classifications to ensure consistency and interoperability. 

4. Data Quality Management: Defining processes and mechanisms to measure, monitor, and improve the quality of data, including data validation, cleansing, and remediation. 

5. Data Access and Security: Implementing controls and mechanisms to regulate data access, protect sensitive information, and ensure compliance with privacy and security regulations. 

6. Data Lifecycle Management: Defining procedures for data creation, capture, storage, retention, archival, and disposal, ensuring adherence to legal and regulatory requirements. 

7. Data Governance Council: Forming a cross-functional governance body or council that includes representatives from various business units to provide oversight, establish policies, and resolve data-related issues. 

Following are the benefits of data governance practices: 

1. Improved Data Quality: Data governance helps maintain data accuracy, consistency, and reliability, enabling better decision-making and reducing errors and inconsistencies. 

2. Enhanced Compliance: Data governance ensures adherence to regulatory requirements, data privacy laws, and industry standards, reducing the risk of penalties and legal issues. 

3. Increased Data Transparency: Clear data governance processes promote transparency and accountability, allowing stakeholders to understand data sources, definitions, and usage. 

4. Better Data Integration and Collaboration: Consistent data standards and policies facilitate data sharing, integration, and collaboration across different departments and systems. 

5. Effective Risk Management: Data governance enables organizations to identify and mitigate data-related risks, such as data breaches, unauthorized access, and data loss. Overall, data governance provides a framework for organizations to manage their data as a valuable strategic asset, ensuring its integrity, security, and optimal utilization for business purposes. 

  Governance Tools 
There are several data governance tools available in the market that can help organizations implement and manage their data governance initiatives. These tools offer features and functionalities to support data quality management, metadata management, data lineage tracking, compliance monitoring, and collaboration among data stewards. Here are some commonly used data governance tools: 

 1. Collibra: Collibra is a popular data governance platform that provides a comprehensive set of capabilities for data governance, including data cataloging, data lineage, data stewardship, and policy management. It offers a centralized repository for managing data assets and supports collaboration among data stakeholders. 

2. Informatica Axon: Informatica Axon is a data governance tool that focuses on data cataloging, metadata management, and data stewardship. It provides a business-friendly interface to document and manage data assets, define data policies, and track data lineage. 

3. IBM InfoSphere Information Governance Catalog: IBM's Information Governance Catalog is a data governance and metadata management tool that helps organizations govern and manage their data assets. It offers features such as data cataloging, metadata discovery, data lineage, and data quality management. 

4. Alation: Alation is a data catalog and governance platform that combines data cataloging, data stewardship, and collaboration features. It enables users to discover, understand, and govern data assets while promoting collaboration and knowledge sharing among data stakeholders. 

5. SAP Data Governance: SAP Data Governance is a solution that provides capabilities for data quality management, metadata management, and data lifecycle management. It integrates with SAP's broader data management suite and offers features such as data profiling, data cleansing, and data lineage tracking. 

6. Talend Data Fabric: Talend Data Fabric is a comprehensive data integration and management platform that includes data governance capabilities. It offers features like data cataloging, data quality management, metadata management, and data lineage. 

7. Ataccama ONE: Ataccama ONE is a data governance and master data management platform that provides features for data quality, data profiling, metadata management, and data lineage. It supports collaborative workflows and offers integration with other data management tools.

Tuesday, January 11, 2011

###001 SQL Script to return user accounts and the roles they are assigned to.....

-- Process
-- Create Temp Table for Users
-- Create Temp Table for Roles
-- Populate Users
-- Populate Roles
-- Iterate though each user and update their roles into a single column
-- Return the users and their roles
Create Table #Temp_Users
(
DatabaseUserName varchar(128),
SYSLoginname varchar(128),
CreateDate datetime,
LastModifiedDate datetime,
LoginType varchar(50),
Roles varchar(1024)
)

Create Table #Temp_Roles
(
Name varchar(128),
Role varchar(128)
)

insert into #Temp_Users
select a.name,
b.Loginname,
[Create Date] = a.CreateDate,
[Last Modified Date] = a.UpdateDate,
LoginType = case
when a.IsNTName = 1 then 'Windows Account'
when a.IsNTGroup = 1 then 'Windows Group'
when a.isSqlUser = 1 then 'SQL Server User'
when a.isAliased =1 then 'Aliased'
when a.isSQLRole = 1 then 'SQL Role'
when a.isAppRole = 1 then 'Application Role'
else 'Unknown'
end,
Roles = ''
from sysusers a
left outer join master..syslogins b on a.sid=b.sid
where a.SID is not null
order by a.Name

insert into #Temp_Roles
select MemberName = u.name, DbRole = g.name
from sysusers u, sysusers g, sysmembers m
where g.uid = m.groupuid
and g.issqlrole = 1
and u.uid = m.memberuid
order by 1, 2



Declare @Name varchar(128)
Declare @Roles varchar(1024)
Declare @Role varchar(128)

DECLARE UserCursor CURSOR for
SELECT DatabaseUserName from #Temp_Users
OPEN UserCursor
FETCH NEXT FROM UserCursor into @Name
WHILE @@FETCH_STATUS = 0
BEGIN
set @Roles = ''
print @Name
DECLARE RoleCursor CURSOR for
SELECT Role from #Temp_Roles where Name = @Name

OPEN RoleCursor
FETCH NEXT FROM RoleCursor into @Role
WHILE @@FETCH_STATUS = 0
BEGIN
if (@Roles > '')
set @Roles = @Roles + ', '+@Role
else
set @Roles = @Role
FETCH NEXT FROM RoleCursor into @Role
end
Close RoleCursor
DEALLOCATE RoleCursor
Update #Temp_Users set Roles = @Roles where DatabaseUserName = @Name
FETCH NEXT FROM UserCursor into @Name
END
CLOSE UserCursor
DEALLOCATE UserCursor

update #Temp_Users
set Roles='public'
where Roles=''

select * from #Temp_Users

drop table #Temp_Users
drop table #Temp_Roles

Wednesday, June 4, 2008

Scheduling the SSIS package which is using SQL Server Authentication to connect to Database

Dear Friends,

There is only a small change required to schedule the SSIS package which is using SQL Server authentication to connect to Database other than Windows authentication (Go through the blog Schedule SSIS through SQL Server Management Studio... to know how to schedule the SSIS package)

Go to step properties (Means Add step or edit step window) -- After selecting the package ,Click on Data sources Tab. There we can see all the connections of SSIS package which we selected. The make the check box "True" for that connection string which is using SQL Server authentication. Then edit the connection string by inserting the SQL server User's password after the User ID

For Eg: Data Source=Databaseservername;User ID=test;pwd=tstpwd;Initial Catalog=testdb;Provider=SQLOLEDB;Persist Security Info=True;

For more clarification refer the following image

Wednesday, March 26, 2008

Schedule SSIS through SQL Server Management Studio (SQL Server 2005)


Scheduling an SSIS Package is an easy job for a DBA.
1. Select New Job from the pop-up menu. (SQL Server Agent --> Jobs --> New Job)



2. Specify a new for the New Job in the General page. Click on New on the Step page, specify the step name, Select SQL Server Integration services in the Type combo box, Select the File System as Package Source and Select package from the path where the package is stored.



3. Click on the Datasource tab, there we could see all the datasources which we specified in the SSIS package. Check mark on those Datasources and verify those attributes specified in the connection string.


4. There is one more interesting option, we can write the log files into files (Check the logging tab)

after filling all these click OK.

5. To create a schedule for this job go to schedule page and create a Schedule as per our requirement.


I hope this blog will help you to create a scheduled Job For SSIS packages. Please let me know if you have any questions

Tuesday, March 25, 2008

Migrate DTS to SSIS

Migrating DTS packages from SQL Server 2000 to SQL Server 2005 is an interesting work.

SSIS is a flexible work environment and it is a very good ETL tool. Those who are interested in Datawarehousing technology OR working in Datawarehousing technology, can try this tool. which is a very user friendly ETL tool

Please follow the following steps to Migrate DTS packages.

1. Login to Business Intelligence Studio

































2. Create a new SSIS package (Or open an existing SSIS package to Migrate the SSIS package)



3. Click mouse right button on SSIS Package folder from the Solution explorer and select "Migrate DTS 2000 Package" from the Pop Up menu OR Select "Migrate DTS 2000 Package" from Project menu




4. Then the Migration Wizard will open click Next







5. Select the source database, From where we need to migrate DTS Packages then Click next.







6. Choose the destination folder then Click next.








7. Select those DTS Packages, which we need to migrate to SSIS (SQL 2005)





8. Select the Logfile location. The migration logfile will have all the information related to DTS Migration. Even if there are some errors the logfile will have the reason for failing.






9. Confirm the information we had given to process like SQL 2000 Server name and SSIS package location then click Finish to start the migration Process.







10. Migration process






Click the close button once the process has completed.



The Migrated DTS packages will be there in the SSIS Foleder under Solution Explorer.





11. Now Open the Migrated package by double clicking on the package name OR choose Open from the popup menu











12. Verify the connection of the exisitng package. Sometimes there will connections to test files or excel files, in those cases we need to manually copy those files into the specific locations and specify those paths in the the connection strings. In the case of Database connections change the Server names/Userids/passwords then verify the connection.









13. k for the errors, Warnigs and messages from the below tab. If there are errors clear those before building the package.




14. Build the package. (Now the SSIS package is ready to use)




I hope this iformations will help you to migrate the DTS packages. Also let me know if you have any questions or comments.

Have a Great Day.......