DataStage Advanced Training Course
Course outline:
(1). Parallel Framework Architecture
1. parallel processing architecture
2. pipeline and partition parallelism
3. role of the configuration file
(2). Job compilation and execution
1. main parts of the configuration file
2. compile process and the generated OSH
3. role and parts of the Score
(3). Partitioning and collecting
1. how partitioning works in the Framework
2. selecting collecting and partitioning algorithms
3. viewing collectors and partitioners in the Score
(4). Sorting
1. sorting data in the parallel framework
2. reducing the number of inserted sorts in the Score
3. optimizing fork-join jobs
(5). Buffer
1. how buffers work
2. buffer tuning
3. avoiding buffer contention
(6). Data types
1. virtual data set
2. schemas
3. handling nulls
4. complex data types, etc.
(7). Reusable components
1. schema files and using them to read sequential files
2. Runtime Column Propagation (RCP)
3. shared containers
(8). Transformer Logic
1. null handling
2. Loop processing
3. groups
(9). Extending DataStage functionality
1. Wrapped stages
2. Build stages
3. External Function routines
4. Custom stages
(10). Best practices
1. stage usage, including Lookup, Aggregator, Transformer, etc.
2. performance tuning