AusDM '06: Proceedings of the fifth Australasian conference on Data mining and analytics - Volume 61
2006 Proceeding
Publisher:
  • Australian Computer Society, Inc.
  • P.O. Box 319 Darlinghurst, NSW 2010
  • Australia
Conference:
Sydney Australia November 29 - 30, 2006
ISBN:
978-1-920682-41-5
Published:
01 November 2006

Article
Free
Safely delegating data mining tasks
pp 1–7

Data mining is playing an important role in decision making for business activities and governmental administration. Since many organizations or their divisions do not possess the in-house expertise and infrastructure for data mining, it is beneficial ...

Article
Free
Data mining methodological weaknesses and suggested fixes
pp 9–16

Predictive accuracy claims should give explicit descriptions of the steps followed, with access to the code used. This allows referees and readers to check for common traps, and to repeat the same steps on other data. Feature selection and/or model ...

Article
Free
Accuracy estimation with clustered dataset
pp 17–22

If the dataset available to machine learning results from cluster sampling (e.g. patients from a sample of hospital wards), the usual cross-validation error rate estimate can lead to biased and misleading results. An adapted cross-validation is ...

Article
Free
Towards automated record linkage
pp 23–31

The field of Record Linkage is concerned with identifying records from one or more datasets which refer to the same underlying entities. Where entity-unique identifiers are not available and errors occur, the process is non-trivial. Many techniques ...

Article
Free
A comparative study of classification methods for microarray data analysis
pp 33–37

In response to the rapid development of DNA Microarray technology, many classification methods have been used for Microarray classification. SVMs, decision trees, Bagging, Boosting and Random Forest are commonly used methods. In this paper, we conduct ...

Article
Free
Data mining in conceptualising active ageing
pp 39–45

The concept of older adults contributing to society in a meaningful way has been termed 'active ageing'. We present applications of data mining techniques on the active ageing data collected via a survey of older Australians on a wide range of social and ...

Article
Free
Analysis of breast feeding data using data mining methods
pp 47–52

The purpose of this study is to demonstrate the benefit of using common data mining techniques on survey data where statistical analysis is routinely applied. The statistical survey is commonly used to collect quantitative information about an item in a ...

Article
Free
Using a kernel-based approach to visualize integrated chronic fatigue syndrome datasets
pp 53–61

We describe the use of a kernel-based approach using the Laplacian matrix to visualize an integrated Chronic Fatigue Syndrome dataset comprising symptom and fatigue questionnaire and patient classification data, complete blood evaluation data and ...

Article
Free
Analyzing harmonic monitoring data using data mining
pp 63–68

Harmonic monitoring has become an important tool for harmonic management in distribution systems. A comprehensive harmonic monitoring program has been designed and implemented on a typical electrical MV distribution system in Australia. The monitoring ...

Article
Free
Discover knowledge from distribution maps using Bayesian networks
pp 69–74

This paper applies a Bayesian network to model multi criteria distribution maps and to discover knowledge contained in spatial data. The procedure consists of three steps: pre processing map data, training the Bayesian Network model using distribution ...

Article
Free
Data mining for lifetime prediction of metallic components
pp 75–81

The ability to accurately predict the lifetime of building components is crucial to optimizing building design, material selection and scheduling of required maintenance. This paper discusses a number of possible data mining methods that can be applied ...

Article
Free
Integrated scoring for spelling error correction, abbreviation expansion and case restoration in dirty text
pp 83–89

An increasing number of language and speech applications are gearing towards the use of texts from online sources as input. Despite such rise, not much work can be found in the aspect of integrated approaches for cleaning dirty texts from online ...

Article
Free
A study of local and global thresholding techniques in text categorization
pp 91–101

Feature Filtering is an approach that is widely used for dimensionality reduction in text categorization. In this approach feature scoring methods are used to evaluate features leading to selection. Thresholding is then applied to select the highest ...

Article
Free
A characterization of WordNet features in Boolean models for text classification
pp 103–109

Supervised text classification is the task of automatically assigning a category label to a previously unlabeled text document. We start with a collection of pre-labeled examples whose assigned categories are used to build a predictive model for each ...

Article
Free
Weighted kernel model for text categorization
pp 111–114

Traditional bag-of-words model and recent word-sequence kernel are two well-known techniques in the field of text categorization. Bag-of-words representation neglects the word order, which could result in less computation accuracy for some types of ...

Article
Free
Visualization of attractive and repulsive zones between variables
pp 115–120

This paper presents a preprocessing step in mining association rules which uses tables to summarize synthetically the way variables interact by highlighting any zones which are attractive. Attractive zones are those which guarantee that potentially ...

Article
Free
On the optimal working set size in serial and parallel support vector machine learning with the decomposition algorithm
pp 121–128

The support vector machine (SVM) is a well-established and accurate supervised learning method for the classification of data in various application fields. The statistical learning task - the so-called training - can be formulated as a quadratic ...

Article
Free
Marking time in sequence mining
pp 129–134

Sequence mining is often conducted over static and temporal datasets as well as over collections of events (episodes). More recently, there has also been a focus on the mining of streaming data. However, while many sequences are associated with absolute ...

Article
Free
Discovering debtor patterns of Centrelink customers
pp 135–144

Data mining is currently becoming an increasingly hot research field, but a large gap still remains between the research of data mining and its application in real-world business. As one of the largest data users in Australia, Centrelink has huge volume ...

Article
Free
What types of events provide the strongest evidence that the stock market is affected by company specific news?
pp 145–153

The efficient market hypothesis states that an efficient market immediately incorporates all available information into the price of the traded entity. It is well established that the stock market is not an efficient market as it consists of numerous ...

Article
Free
Investigating the size and value effect in determining performance of Australian listed companies: a neural network approach
pp 155–161

This paper explores the size and value effect in influencing performance of individual companies using backpropagation neural networks. According to existing theory, companies with small market capitalization and high book to market ratios have a ...

Article
Free
Extraction of flat and nested data records from web pages
pp 163–168

This paper studies the problem of identification and extraction of flat and nested data records from a given web page. With the explosive growth of information sources available on the World Wide Web, it has become increasingly difficult to ...

Article
Free
Tracking the changes of dynamic web pages in the existence of URL rewriting
pp 169–176

Crawlers in a knowledge management system need to collect and archive documents from websites, and also track the change status of these documents. However, the existence of URL rewriting mechanism raises a page tracking problem since the URLs of a pair ...

Article
Free
A framework of combining Markov model with association rules for predicting web page accesses
pp 177–184

The importance of predicting Web users' behaviour and their next movement has been recognised and discussed by many researchers lately. Association rules and Markov models are the most commonly used approaches for this type of prediction. Association ...

Article
Free
Modeling spread of ideas in online social networks
pp 185–190

Internet based online social networks collectively facilitate the spread of ideas. Hence, to understand how social networks evolve as a function of time, it is critical to learn the relationship between the information dissemination pathways or flows ...

Contributors
  • The Australian National University
  • University of South Australia
  • Western Sydney University
  • The Australian National University

Acceptance Rates

AusDM '06 Paper Acceptance Rate: 25 of 58 submissions, 43%
Overall Acceptance Rate: 98 of 232 submissions, 42%

Year        Submitted  Accepted  Rate
AusDM '12   55         25        45%
AusDM '11   50         22        44%
AusDM '07   69         26        38%
AusDM '06   58         25        43%
Overall     232        98        42%
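
The listed rates can be reproduced from the submitted and accepted counts. The following is a minimal sketch, assuming each rate is the accepted count divided by the submitted count, rounded to the nearest whole percent; the names `rows` and `acceptance_rate` are illustrative and not part of the source page.

```python
# Per-year (submitted, accepted) counts copied from the acceptance-rate figures.
rows = {
    "AusDM '12": (55, 25),
    "AusDM '11": (50, 22),
    "AusDM '07": (69, 26),
    "AusDM '06": (58, 25),
}


def acceptance_rate(submitted, accepted):
    """Return accepted/submitted as a whole-number percentage (assumed rounding)."""
    return round(100 * accepted / submitted)


# Rate for each listed year.
per_year = {name: acceptance_rate(s, a) for name, (s, a) in rows.items()}

# The overall rate is computed from the summed counts, not by averaging the yearly rates.
total_submitted = sum(s for s, _ in rows.values())
total_accepted = sum(a for _, a in rows.values())
overall = acceptance_rate(total_submitted, total_accepted)

for name, rate in per_year.items():
    print(f"{name}: {rate}%")
print(f"Overall: {total_accepted} of {total_submitted}, {overall}%")
```

Under that assumption, the summed counts and every computed percentage agree with the figures listed for the four years and for the overall row.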