Table of Contents for Data Munging with Perl
Chapter/Section Title | Page # | Page Count
foreword | xi |
preface | xiii |
about the cover illustration | xviii |
PART I FOUNDATIONS | 1 | 78
Data, data munging, and Perl | 3 | 15
What is data munging? | 4 | 3
Data munging processes | 4 | 1
Data recognition | 5 | 1
Data parsing | 6 | 1
Data filtering | 6 | 1
Data transformation | 6 | 1
Why is data munging important? | 7 | 2
Accessing corporate data repositories | 7 | 1
Transferring data between multiple systems | 7 | 1
Real-world data munging examples | 8 | 1
Where does data come from? Where does it go? | 9 | 3
Data files | 9 | 1
Databases | 10 | 1
Data pipes | 11 | 1
Other sources/sinks | 11 | 1
What forms does data take? | 12 | 2
Unstructured data | 12 | 1
Record-oriented data | 13 | 1
Hierarchical data | 13 | 1
Binary data | 13 | 1
What is Perl? | 14 | 2
Getting Perl | 15 | 1
Why is Perl good for data munging? | 16 | 1
Further information | 17 | 1
Summary | 17 | 1
General munging practices | 18 | 21
Decouple input, munging, and output processes | 19 | 1
Design data structures carefully | 20 | 5
Example: the CD file revisited | 20 | 5
Encapsulate business rules | 25 | 6
Reasons to encapsulate business rules | 26 | 1
Ways to encapsulate business rules | 26 | 1
Simple module | 27 | 1
Object class | 28 | 3
Use UNIX "filter" model | 31 | 5
Overview of the filter model | 31 | 1
Advantages of the filter model | 32 | 4
Write audit trails | 36 | 2
What to write to an audit trail | 36 | 1
Sample audit trail | 37 | 1
Using the UNIX system logs | 37 | 1
Further information | 38 | 1
Summary | 38 | 1
Useful Perl idioms | 39 | 18
Sorting | 40 | 7
Simple sorts | 40 | 1
Complex sorts | 41 | 1
The Orcish Manoeuvre | 42 | 1
Schwartzian transform | 43 | 3
The Guttman-Rosler transform | 46 | 1
Choosing a sort technique | 46 | 1
Database Interface (DBI) | 47 | 2
Sample DBI program | 47 | 2
Data::Dumper | 49 | 2
Benchmarking | 51 | 2
Command line scripts | 53 | 2
Further information | 55 | 1
Summary | 56 | 1
Pattern matching | 57 | 22
String handling functions | 58 | 2
Substrings | 58 | 1
Finding strings within strings (index and rindex) | 59 | 1
Case transformations | 60 | 1
Regular expressions | 60 | 17
What are regular expressions? | 60 | 1
Regular expression syntax | 61 | 4
Using regular expressions | 65 | 5
Example: translating from English to American | 70 | 3
More examples: /etc/passwd | 73 | 3
Taking it to extremes | 76 | 1
Further information | 77 | 1
Summary | 78 | 1
PART II DATA MUNGING | 79 | 68
Unstructured data | 81 | 15
ASCII text files | 82 | 5
Reading the file | 82 | 2
Text transformations | 84 | 1
Text statistics | 85 | 2
Data conversions | 87 | 7
Converting the character set | 87 | 1
Converting line endings | 88 | 2
Converting number formats | 90 | 4
Further information | 94 | 1
Summary | 95 | 1
Record-oriented data | 96 | 31
Simple record-oriented data | 97 | 11
Reading simple record-oriented data | 97 | 3
Processing simple record-oriented data | 100 | 2
Writing simple record-oriented data | 102 | 3
Caching data | 105 | 3
Comma-separated files | 108 | 2
Anatomy of CSV data | 108 | 1
Text::CSV_XS | 109 | 1
Complex records | 110 | 4
Example: a different CD file | 111 | 2
Special values for $/ | 113 | 1
Special problems with date fields | 114 | 9
Built-in Perl date functions | 114 | 6
Date::Calc | 120 | 1
Date::Manip | 121 | 1
Choosing between date modules | 122 | 1
Extended example: web access logs | 123 | 3
Further information | 126 | 1
Summary | 126 | 1
Fixed-width and binary data | 127 | 20
Fixed-width data | 128 | 11
Reading fixed-width data | 128 | 7
Writing fixed-width data | 135 | 4
Binary data | 139 | 5
Reading PNG files | 140 | 3
Reading and writing MP3 files | 143 | 1
Further information | 144 | 1
Summary | 145 | 2
PART III SIMPLE DATA PARSING | 147 | 78
Complex data formats | 149 | 14
Complex data files | 150 | 4
Example: metadata in the CD file | 150 | 2
Example: reading the expanded CD file | 152 | 2
How not to parse HTML | 154 | 4
Removing tags from HTML | 154 | 3
Limitations of regular expressions | 157 | 1
Parsers | 158 | 4
An introduction to parsers | 158 | 3
Parsers in Perl | 161 | 1
Further information | 162 | 1
Summary | 162 | 1
HTML | 163 | 12
Extracting HTML data from the World Wide Web | 164 | 1
Parsing HTML | 165 | 2
Example: simple HTML parsing | 165 | 2
Prebuilt HTML parsers | 167 | 5
HTML::LinkExtor | 167 | 2
HTML::TokeParser | 169 | 2
HTML::TreeBuilder and HTML::Element | 171 | 1
Extended example: getting weather forecasts | 172 | 2
Further information | 174 | 1
Summary | 174 | 1
XML | 175 | 34
XML overview | 176 | 2
What's wrong with HTML? | 176 | 1
What is XML? | 176 | 2
Parsing XML with XML::Parser | 178 | 13
Example: parsing weather.xml | 178 | 1
Using XML::Parser | 179 | 2
Other XML::Parser styles | 181 | 7
XML::Parser handlers | 188 | 3
XML::DOM | 191 | 2
Example: parsing XML using XML::DOM | 191 | 2
Specialized parsers--XML::RSS | 193 | 4
What is RSS? | 193 | 1
A sample RSS file | 193 | 2
Example: creating an RSS file with XML::RSS | 195 | 1
Example: parsing an RSS file with XML::RSS | 196 | 1
Producing different document formats | 197 | 11
Sample XML input file | 197 | 1
XML document transformation script | 198 | 7
Using the XML document transformation script | 205 | 3
Further information | 208 | 1
Summary | 208 | 1
Building your own parsers | 209 | 16
Introduction to Parse::RecDescent | 210 | 2
Example: parsing simple English sentences | 210 | 2
Returning parsed data | 212 | 5
Example: parsing a Windows INI file | 212 | 1
Understanding the INI file grammar | 213 | 1
Parser actions and the @item array | 214 | 1
Example: displaying the contents of @item | 214 | 2
Returning a data structure | 216 | 1
Another example: the CD data file | 217 | 6
Understanding the CD grammar | 218 | 1
Testing the CD file grammar | 219 | 1
Adding parser actions | 220 | 3
Other features of Parse::RecDescent | 223 | 1
Further information | 224 | 1
Summary | 224 | 1
PART IV THE BIG PICTURE | 225 | 7
Looking back--and ahead | 227 | 5
The usefulness of things | 228 | 1
The usefulness of data munging | 228 | 1
The usefulness of Perl | 228 | 1
The usefulness of the Perl community | 229 | 1
Things to know | 229 | 3
Know your data | 229 | 1
Know your tools | 230 | 1
Know where to go for more information | 230 | 2
appendix A Modules reference | 232 | 22
appendix B Essential Perl | 254 | 19
index | 273 |