Table of Contents for Knowledge Discovery and Data Mining
Chapter/Section Title
Page #
Page Count
Preface
xiii
 
Contributors
xvii
 
Part I KNOWLEDGE DISCOVERY AND DATA MINING IN THEORY
1
136
Estimating concept difficulty with cross entropy
3
29
K. Nazar
M.A. Bramer
Introduction
3
1
Attributes and examples
4
1
The need for bias in machine learning
4
3
Bias interference
5
2
Concept dispersion and feature interaction
7
1
Why do we need to estimate concept difficulty?
8
2
Data-based difficulty measures
10
4
μ-ness
10
1
Variation
11
1
Blurring
11
3
Why use the J measure as a basis for a blurring measure?
14
1
Effects of various data characteristics on blurring and variation
15
4
Irrelevant attributes
17
2
Dataset analysis
19
2
Problems with information-theoretic-based blurring measures
21
1
Estimating attribute relevance with the RELIEF algorithm
22
4
Experiments with RELIEFF
23
3
Blurring over disjoint subsets
26
1
Results and discussion
27
1
Summary and future research
28
2
References
30
2
Analysing outliers by searching for plausible hypotheses
32
14
X. Liu
G. Cheng
Introduction
32
1
Statistical treatment of outliers
33
2
An algorithm for outlier analysis
35
3
Experimental results
38
3
Case I: onchocerciasis
38
1
Case II: glaucoma
39
2
Evaluation
41
3
Concluding remarks
44
1
Acknowledgments
44
1
References
44
2
Attribute-value distribution as a technique for increasing the efficiency of data mining
D. McSherry
46
18
Introduction
46
2
Targeting a restricted class of rules
48
2
Discovery effort and yield
50
3
Attribute-value distribution
53
2
Experimental results
55
7
The contact-lens data
55
5
The project-outcome dataset
60
2
Discussion and conclusions
62
1
References
63
1
Using background knowledge with attribute-oriented data mining
64
23
M. Shapcott
S. McClean
B. Scotney
Introduction
64
2
Partial value model
66
5
Definition: partial value
66
1
Definition: partial-value relation
66
1
Example
66
1
Aggregation
67
1
Definition: simple count operator
67
1
Example
68
1
Definition: simple aggregate operator
68
1
Example
68
1
Aggregation of partial values
69
1
Definition: partial value aggregate operator
69
1
Example
69
1
Definition: partial-value count operator
70
1
Theoretical framework
70
1
Reengineering the database: the role of background knowledge
71
5
Concept hierarchies
71
2
Database-integrity constraints
73
1
Definition: simple comparison predicate
73
1
Examples of simple comparison predicates
74
1
Definition: table-based predicate
74
1
Example of a table-based predicate
74
1
Expression of rules as table-based predicates
74
1
Reengineering the database
74
2
Multiattribute count operators
76
6
Example
77
1
Example
77
1
Example
78
1
Quasi-independence
79
2
Example
81
1
Interestingness
81
1
Example
82
1
Related work
82
1
Conclusions
83
1
Acknowledgments
84
1
References
84
3
A development framework for temporal data mining
87
27
X. Chen
I. Petrounias
Introduction
87
2
Analysis and representation of temporal features
89
6
Time domain
89
1
Calendar expression of time
90
2
Periodicity of time
92
2
Time dimensions in temporal databases
94
1
Potential knowledge and temporal data mining problems
95
5
Forms of potential temporal knowledge
95
2
Associating knowledge with temporal features
97
1
Temporal mining problems
98
2
A framework for temporal data mining
100
4
A temporal mining language
100
2
System architecture
102
2
An example: discovery of temporal association rules
104
6
Mining problem
104
2
Description of mining tasks in TQML
106
1
Search algorithms
107
3
Conclusion and future research direction
110
1
References
110
4
An integrated architecture for OLAP and data mining
Z. Chen
114
23
Introduction
115
1
Preliminaries
116
3
Decision-support queries
116
1
Data warehousing
116
1
Basics of OLAP
117
1
Star schema
118
1
A materialised view for sales profit
118
1
Differences between OLAP and data mining
119
4
Basic concepts of data mining
119
1
Different types of query can be answered at different levels
120
1
Aggregation semantics
120
2
Sensitivity analysis
122
1
Different assumptions or heuristics may be needed at different levels
123
1
Combining OLAP and data mining: the feedback sandwich model
123
3
Two different ways of combining OLAP and data mining
124
1
The feedback sandwich model
125
1
Towards integrated architecture for combined OLAP/data mining
126
2
Three specific issues
128
7
On the use and reuse of intensional historical data
128
3
How data mining can benefit OLAP
131
2
OLAP-enriched data mining
133
2
Conclusion
135
1
References
135
2
Part II KNOWLEDGE DISCOVERY AND DATA MINING IN PRACTICE
137
167
Empirical studies of the knowledge discovery approach to health-information analysis
139
21
M. Lloyd-Williams
Introduction
139
1
Knowledge discovery and data mining
140
9
The knowledge discovery process
140
7
Artificial neural networks
147
2
Empirical studies
149
8
The 'Health for all' database
149
4
The 'Babies at risk of intrapartum asphyxia' database
153
2
Infertility databases
155
2
Conclusions
157
1
References
158
2
Direct knowledge discovery and interpretation from a multilayer perceptron network which performs low-back-pain classification
160
20
M.L. Vaughn
S.J. Cavill
S.J. Taylor
M.A. Foy
A.J.B. Fogg
Introduction
160
1
The MLP network
161
2
The low-back-pain MLP network
163
1
The interpretation and knowledge-discovery method
163
5
Discovery of the feature-detector neurons
163
1
Discovery of the significant inputs
164
1
Discovery of the negated significant inputs
164
2
Knowledge learned by the MLP from the training data
166
1
MLP network validation and verification
167
1
Knowledge discovery from LBP example training cases
168
5
Discovery of the feature detectors for example training cases
168
1
Discovery of the significant inputs for example training cases
168
1
Data relationships and explanations for example training cases
168
2
Discussion of the training-example data relationships
170
1
Induced rules from training-example cases
170
3
Knowledge discovery from all LBP MLP training cases
173
3
Discussion of the class key input rankings
173
3
Validation of the LBP MLP network
176
1
Validation of the training cases
176
1
Validation of the test cases
176
1
Conclusions
177
1
Future work
177
1
Acknowledgments
178
1
References
178
2
Discovering knowledge from low-quality meteorological databases
180
24
C.M. Howard
Rayward-Smith
Introduction
180
2
The meteorological domain
180
2
The preprocessing stage
182
6
Visualisation
182
1
Missing values
182
3
Unreliable data
185
1
Discretisation
186
1
Feature selection
187
1
Feature construction
188
1
The data-mining stage
188
7
Simulated annealing
189
1
SA with missing and unreliable data
190
5
A toolkit for knowledge discovery
195
1
Results and analysis
196
3
Results from simulated annealing
198
1
Results from C5.0
198
1
Comparison and evaluation of results
199
1
Summary
199
1
Discussion and further work
200
1
References
201
3
A meteorological knowledge-discovery environment
204
23
A.G. Buchner
J.C.L. Chan
S.L. Hung
J.G. Hughes
Introduction
204
1
Some meteorological background
205
6
Available data sources
206
2
Related work
208
3
MADAME's architecture
211
1
Building a meteorological data warehouse
212
5
The design
212
1
Information extraction
213
1
Data cleansing
214
1
Data processing
215
1
Data loading and refreshing
216
1
The knowledge-discovery components
217
5
Knowledge modelling
217
4
Domain knowledge
221
1
Prediction trial runs
222
2
Nowcasting of heavy rainfall
222
1
Landslide nowcasting
223
1
Conclusions and further work
224
1
Acknowledgments
224
1
References
225
2
Mining the organic compound jungle: a functional programming approach
227
13
K.E. Burn-Thornton
J. Bradshaw
Introduction
227
1
Decision-support requirements in the pharmaceutical industry
227
3
Graphical comparison
228
1
Structural keys
228
1
Fingerprints
229
1
Variable-sized fingerprints
230
1
Functional programming language Gofer
230
4
Functional programming
231
3
Design of prototype tool and main functions
234
2
Design of tool
234
1
Main functions
235
1
Methodology
236
1
Results
236
2
Sets A, B and C (256, 512 and 1024 bytes)
237
1
Set D (2048 bytes)
237
1
Conclusions
238
1
Future work
239
1
References
239
1
Data mining with neural networks: an applied example in understanding electricity consumption patterns
240
64
P. Brierley
B. Batty
Neural networks
241
9
What are neural networks?
241
1
Why use neural networks?
242
1
How do neural networks process information?
242
2
Things to be aware of...
244
6
Electric load modelling
250
5
The data being mined
250
2
Why forecast electricity demand?
252
1
Previous work
253
1
Network used
254
1
Total daily load model
255
11
Overfitting and generalisation
264
2
Rule extraction
266
8
Day of the week
267
1
Time of year
268
1
Growth
269
2
Weather factors
271
1
Holidays
272
2
Model comparisons
274
2
Half-hourly model
276
12
Initial input data
276
1
Results
277
9
Past loads
286
1
How the model is working
286
1
Extracting the growth
287
1
Populations of models
288
1
Summary
288
1
References
289
3
Appendixes
292
12
Backpropagation weight update rule
292
7
Fortran 90 code for a multilayer perceptron
299
5
Index
304