05-10-2016, 12:28 PM
COMPARATIVE ANALYSIS ON THE EFFICIENCY OF DATA MINING TECHNIQUES TO ESTIMATE RETURN OF FINANCIAL
1457871168-DOCUMENT1.doc (Size: 168 KB / Downloads: 4)
ABSTRACT
Identifying profitable trading strategies for the financial instruments traded at exchanges around the world has long been a subject of interest both for academia and for practitioners involved in investment activities. One of the biggest challenges of the financial markets is correlating information about the past with future events. To date, efforts to model financial market phenomena with a view to obtaining predictions have failed in a disarming proportion of cases.
The research proposed in this thesis aimed at testing the possibilities of optimizing decision processes in the area of financial investments, by using technologies that discover predictive association rules from historical data.
Thus this thesis attempts to verify the utility of artificial intelligence technologies in the process of automatically acquiring knowledge about the behavior of financial instruments, and to verify the capacity of such technologies to identify predictable events in that behavior, with the aim of optimizing investment decisions.
Our system functions by automating the best of these transformations and applying the Apriori algorithm, which is the community standard algorithm for mining association rules.
INTRODUCTION
This thesis investigates the data mining process resulting in a predictor for financial market numerical series. The series experimented with come from financial data, which is usually very hard to forecast. One approach to prediction is to spot patterns in the past and to test them on more recent data. If a pattern is followed by the same outcome frequently enough, we can gain confidence that it is a genuine relationship. Because this approach does not assume any special knowledge or form of the regularities, the method is quite general – applicable to other time series, not just financial.
However, the generality puts strong demands on the pattern detection – it has to notice regularities in any of the many possible forms. The thesis' quest for automated pattern-spotting involves numerous data mining and optimization techniques: neural networks, decision trees, nearest neighbors, regression, genetic algorithms and others. A comparison of their performance on stock exchange index data is one of the contributions. As no single technique performed sufficiently well, a number of predictors have been put together, forming a voting ensemble. The vote is diversified not only by different training data – as usually done – but also by the learning method and its parameters.
The algorithm development goes still further: A prediction can only be as good as the training data, therefore the need for good data preprocessing. In particular, new multivariate discretization and attribute selection algorithms are presented. The thesis also includes overviews of prediction pitfalls and possible solutions, as well as of ensemble-building for series data with financial characteristics, such as noise and many attributes.
Factors in Financial Prediction
Some questions of scientific and practical interest concerning financial prediction follow.
Prediction Possibility: Is statistically significant prediction of financial markets data possible? Is profitable prediction of such data possible? The latter involves the answer to the former question, adjusted for constraints imposed by real markets, such as commissions, liquidity limits and the influence of one's own trades.
Methods: If prediction is possible, what methods are best at performing it? Which methods are best suited for which data characteristics, and can this be said in advance?
Meta-methods: What are the ways to improve the methods? Can meta-heuristics successful in other domains, such as ensembles or pruning, improve financial prediction?
Data: Can the amount and type of data needed for prediction be characterized?
Data preprocessing: Can data transformations that facilitate prediction be identified? In particular, what transformation formulae enhance input data? Are the commonly used financial indicator formulae any good?
Evaluation: What are the features of a sound evaluation procedure, respecting the properties of financial data and the expectations of financial prediction? How should rare but important data events, such as crashes, be handled? What are the common evaluation pitfalls?
Predictor development: Are there any common features of successful prediction systems? If so, what are they, and how could they be advanced? Can common reasons for the failure of financial prediction be identified? Are they intrinsic and irreparable, or is there a way to amend them?
Transfer to other domains: Can the methods developed for financial prediction benefit other domains?
Predictability estimation: Can financial data be reasonably quickly estimated to be predictable or not, without the investment of building a custom system? What are the methods, what do they actually say, and what are their limits?
Consequences of predictability: What are the theoretical and practical consequences of demonstrated predictability of financial data, or of the impossibility of it? How does a successful prediction method translate into economic models? What could be the social consequences of financial prediction?
Financial Time Series Properties
One may wonder if there are universal characteristics of the many series coming from markets that differ in size, location, commodities, sophistication etc. Moreover, interacting systems in other fields, such as statistical mechanics, suggest that the properties of financial time series depend only loosely on the market microstructure and are common to a range of interacting systems.
Such observations have stimulated new models of markets based on analogies with particle systems and brought in new analysis techniques opening the era of econophysics.
The Efficient Market Hypothesis (EMH), developed in 1965, initially gained wide acceptance in the financial community. It asserts, in its weak form, that the current price of an asset already reflects all information obtainable from past prices, and assumes that news is promptly incorporated into prices. Since news is assumed unpredictable, so are prices. However, real markets do not obey all the consequences of the hypothesis: for example, a price random walk implies a normal distribution of returns, which is not what is observed, and there is a delay while the price stabilizes to a new level after news. This, among other things, has led to a more modern view. Overall, the best evidence points to the following conclusion: the market is not efficient with respect to any of the so-called levels of efficiency.
The value investing phenomenon is inconsistent with semi-strong form efficiency, and the January effect is inconsistent even with weak form efficiency. Overall, the evidence indicates that a great deal of information available at all levels is, at any given time, reflected in stock prices. The market may not be easily beaten, but it appears to be beatable, at least for anyone willing to work at it.
Distribution of financial series (Cont, 1999) tends to be non-normal, sharp peaked and heavy-tailed, these properties being more pronounced for intraday values. Such observations were pioneered in the 1960s (Mandelbrot, 1963), interestingly around the time the EMH was formulated.
Volatility – measured by the standard deviation – also has common characteristics (Tsay, 2002). First, there exist volatility clusters, i.e. volatility may be high for certain periods and low for others. Second, volatility evolves over time in a continuous manner; volatility jumps are rare. Third, volatility does not diverge to infinity but varies within a fixed range, which means that it is often stationary. Fourth, the volatility reaction to a big price increase seems to differ from the reaction to a big price drop.
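As a rough illustration of the volatility measure used above, the sketch below computes a rolling standard deviation of returns. It is written in VB.NET, the language of the proposed system; the function name and window size are illustrative assumptions, not part of the thesis.

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module VolatilitySketch
    ' Rolling standard deviation of returns as a simple volatility proxy.
    ' windowSize is an assumed parameter, e.g. 20 trading days (must be at least 2).
    Function RollingVolatility(returns As IList(Of Double), windowSize As Integer) As List(Of Double)
        Dim result As New List(Of Double)()
        For i As Integer = windowSize To returns.Count
            Dim window = returns.Skip(i - windowSize).Take(windowSize).ToList()
            Dim mean = window.Average()
            Dim variance = window.Sum(Function(r) (r - mean) * (r - mean)) / (windowSize - 1)
            result.Add(Math.Sqrt(variance))
        Next
        Return result
    End Function
End Module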
Extreme values appear more frequently in a financial series than in a normally-distributed series of the same variance. This is important to the practitioner, since such values often cannot be disregarded as erroneous outliers but must be actively anticipated, because their magnitude can strongly influence trading performance.
The scaling property of a time series indicates that the series is self-similar at different time scales (Mantegna & Stanley, 2000). This is common in financial time series: given a plot of returns without the axes labelled, it is next to impossible to say whether it represents hourly, daily or monthly changes, since all the plots look similar, with differences appearing only at minute resolution. Thus prediction methods developed for one resolution could, in principle, be applied to others. Data frequency refers to how often series values are collected: hourly, daily, weekly etc. Usually, if a financial series provides values on a daily, or longer, basis, it is low frequency data; otherwise – when many intraday quotes are included – it is high frequency. Tick-by-tick data includes all individual transactions, and as such the event-driven time between data points varies, creating challenges even for such a simple calculation as correlation.
The minute market microstructure and massive data volume create new problems and possibilities not dealt with by the thesis.
1. DATA MINING FOR FINANCIAL APPLICATIONS
Boris Kovalerchuk, Central Washington University, USA; Evgenii Vityaev, Institute of Mathematics, Russian Academy of Sciences, Russia
This work describes data mining in finance by discussing financial tasks and the specifics of methodologies and techniques in this data mining area. It covers time dependence, data selection, forecast horizon, measures of success, quality of patterns, hypothesis evaluation, problem identification, method profile, and attribute-based and relational methodologies. It also discusses data mining models and practice in finance, including the use of neural networks in portfolio management, the design of interpretable trading rules, and the discovery of money laundering schemes using decision rules and relational data mining methodology.
Forecasting the stock market, currency exchange rates and bank bankruptcies, understanding and managing financial risk, trading futures, credit rating, loan management, bank customer profiling, and money laundering analyses are core financial tasks for data mining (Nakhaeizadeh et al., 2002). Some of these tasks, such as bank customer profiling (Berka, 2002), have many similarities with data mining for customer profiling in other fields. Stock market forecasting includes uncovering market trends, planning investment strategies, and identifying the best time to purchase stocks and which stocks to purchase. Financial institutions produce huge datasets that build a foundation for approaching these enormously complex and dynamic problems with data mining tools. The potentially significant benefits of solving these problems have motivated extensive research for years.
2. Amalgamation of Genetic Selection and Boosting
Stefan Zemke, Department of Computer and System Sciences, Royal Institute of Technology (KTH) and Stockholm University, Forum 100, 164 40 Kista, Sweden
This synopsis comes from research on financial time series prediction (Zemke, 1998). Initially four methods (ANN, kNN, Bayesian classifier and GP) were compared for accuracy, and the best, kNN, was scrutinized by GA-optimizing various parameters. However, the resulting predictors were often unstable. This led to the use of bagging (Breiman, 1996), a majority voting scheme provably reducing variance. The improvement came at no computational cost: instead of taking the best evolved kNN classifier (as defined by its parameters), all those above a threshold voted on the class.
Next, a method similar to bagging but acclaimed as better was tried: AdaBoost (Freund & Schapire, 1996), which works by creating a (weighted) ensemble of classifiers, each trained on an updated distribution of examples, with those misclassified by the previous ensemble getting more weight. A population of classifiers was GA-optimized for minimal error on the training distribution. Once the best individual exceeded a threshold, it joined the ensemble. After the distribution, and thus the fitness, was updated, the GA proceeded with the same classifier population, effectively implementing data-classifier coevolution. However, as the distribution drifted away from the (initial) uniform one, GA convergence became problematic. This is averted by rebuilding the GA population from the training set after each distribution update. A classifier consists of a list of prototypes, one per class, and a binary vector selecting the active features for 1-NN determination.
3. Feasibility Study on Short-Term Stock Prediction
Stefan Zemke, Department of Computer and System Sciences, Royal Institute of Technology (KTH) and Stockholm University, Forum 100, 164 40 Kista, Sweden
This paper presents an experimental system predicting the direction of change of a stock exchange index with up to 76 per cent accuracy. The period concerned varies from 1 to 30 days. The method combines probabilistic and pattern-based approaches into one highly robust system. It first classifies the past of the time series involved into binary patterns and then analyzes the recent data pattern and probabilistically assigns a prediction based on the similarity to past patterns.
The tests have been performed on 750 daily index quotes of the Polish stock exchange, with the training data reaching another 400 sessions back. Index changes were given a binary characterization: 1 for strictly positive changes, 0 otherwise. Index prediction – the binary function of change between the current and future value – was attempted for periods of 1, 3, 5, 10,
OBJECTIVES OF THE STUDY
EXISTING SYSTEM
Financial instruments are often highly complex. An effective presentation of the relevant risks is therefore vital for users', and especially investors', understanding of financial reports in their decision-making processes. The research proposed in this thesis aimed at testing the possibilities of optimizing decision processes in the area of financial investments, by using technologies that discover predictive association rules from historical data.
PROPOSED SYSTEM
The system focuses on investigating the methods and instruments of risk valuation for financial investments, to support decision making. The thesis discusses the capital markets and the financial instruments used by investors. In today's business, having the necessary information on time is very important. The capital markets have the same dynamic as information technology and are driven by it. This is why, for financial analysts, the computer and the Internet are two very helpful instruments. Software has been created, and is still being developed, to support investors who want easy and quick access to the capital markets.
The proposed system describes the capital market and financial market, showing a comparison between the companies, based on the risks present on markets.
WHY USE DATA MINING?
In the modern computing world, data mining is the development of computational algorithms for the identification or extraction of structure from data. This is done in order to help reduce, model, understand, or analyze the data. Tasks supported by data mining include prediction, segmentation, dependency modeling, summarization, and change and deviation detection.
Database systems have brought digital data capture and storage to the mainstream of data processing, leading to the creation of large data warehouses. These are databases whose primary purpose is to gain access to data for analysis and decision support. Traditional manual data analysis and exploration requires highly trained data analysts and is ineffective for high dimensionality (large numbers of variables) and massive data sets.
A data set can be viewed abstractly as a set of records, each consisting of values for a set of dimensions (variables). While data records may exist physically in a database system in a schema that spans many tables, the logical view is of concern here. Databases with many dimensions pose fundamental problems that transcend query execution and optimization.
A fundamental problem is query formulation: how is it possible to provide data access when a user cannot specify the target set exactly, as is required by a conventional database query language such as SQL (Structured Query Language)? Decision support queries are difficult to state.
Data mining techniques are fundamentally data reduction and visualization techniques. As the number of dimensions grows, the number of possible combinations of choices for dimensionality reduction explodes. For an analyst exploring models, it is infeasible to go through the various ways of projecting the dimensions or selecting the right sub-samples (reduction along columns and rows).
Data mining is based on machine-based exploration of many of the possibilities before a selected reduced set is presented to the analyst for feedback.
In computer science and data mining, Apriori is a classic algorithm for learning association rules. Apriori is designed to operate on databases containing transactions (for example, collections of items bought by customers, or details of website visits). Other algorithms are designed for finding association rules in data having no transactions (Winepi and Minepi), or having no timestamps (DNA sequencing).
As is common in association rule mining, given a set of itemsets (for instance, sets of retail transactions, each listing individual items purchased), the algorithm attempts to find subsets which are common to at least a minimum number C (the cutoff, or minimum support threshold) of the itemsets.
Apriori uses a "bottom up" approach, where frequent subsets are extended one item at a time (a step known as candidate generation), and groups of candidates are tested against the data. The algorithm terminates when no further successful extensions are found.
Apriori uses breadth-first search and a hash tree structure to count candidate item sets efficiently. It generates candidate item sets of length k from item sets of length k - 1. Then it prunes the candidates which have an infrequent sub pattern.
According to the downward closure lemma, the candidate set contains all frequent k-length item sets. After that, it scans the transaction database to determine frequent item sets among the candidates. For determining frequent items quickly, the algorithm uses a hash tree to store candidate itemsets. This hash tree has item sets at the leaves and hash tables at internal nodes.
The term data mining is not new to statisticians. It is a term synonymous with data dredging or fishing and has been used to describe the process of trawling through data in the hope of identifying patterns.
It has a derogatory connotation because a sufficiently exhaustive search will certainly throw up patterns of some kind - by definition data that are not simply uniform have differences, which can be interpreted as patterns. The trouble is that many of these "patterns" will simply be a product of random fluctuations, and will not represent any underlying structure. The object of data analysis is not to model the fleeting random patterns of the moment, but to model the underlying structures, which give rise to consistent and replicable patterns. To statisticians, then, the term data mining conveys the sense of naive hope vainly struggling against the cold realities of chance.
To other researchers, however, the term is seen in a much more positive light. Stimulated by progress in computer technology and electronic data acquisition, recent decades have seen the growth of huge databases, in fields ranging from supermarket sales and banking, through astronomy, particle physics, chemistry, and medicine, to official and governmental statistics.
These databases are viewed as a resource. It is certain that there is much valuable information in them, information that has not been tapped, and data mining is regarded as providing a set of tools by which that information may be extracted. Looked at in this positive light, it is hardly surprising that the commercial, industrial, and economic possibilities inherent in the notion of extracting information from these large masses of data have attracted considerable interest.
Databases can contain vast quantities of data describing decisions, performance and operations. In many cases the database contains critical information concerning past business performance which could be used to predict the future.
Often the sheer volume of the data makes the extraction of this business information impossible by manual methods. Data mining is a set of techniques which makes it possible.
Data mining (also known as Knowledge Discovery) technology helps businesses discover hidden data patterns and provides predictive information, which can be applied to benefit the business.
The basic approach is to access a database of historical data and to identify relationships, which have a bearing on a specific issue, and then extrapolate from these relationships to predict future performance or behavior. The human analyst plays an important role in that only they can decide whether a pattern, rule or function is interesting, relevant and useful to an enterprise.
TECHNICAL BACKGROUND
Data Mining and Knowledge Discovery in Databases are terms used interchangeably. Other terms often used are data or information harvesting, data archeology, functional dependency analysis, knowledge extraction and data pattern analysis.
A high level definition of Data Mining is: the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data. Data mining is not a simple process and there is no tool that can do the job automatically. Tools can aid data mining, but it requires both human data mining expertise and human domain expertise.
Data mining consists of a number of operations, each of which is supported by a variety of technologies, such as rule induction, neural networks and conceptual clustering. In real-world applications, information extraction requires the cooperative use of several data mining operations and techniques.
THE BASIC DATA MINING PROCESS IS AS FOLLOWS
1. Define the business objective and expected operational environment of any expected resulting system.
2. Select data. Substantial databases with a meaningful sample of data are required. Selecting data often consists of selecting a time span, geography or product set that we want addressed, and the variables that we want to consider.
3. Transform data. This involves determining how to represent the data for the data-mining algorithm, e.g. should age be represented as a member of a set (25-35 year olds) or as a straight number? (A minimal sketch of such a binning is given after this list.)
4. Run the data-mining algorithm or combination of algorithms. Iteration to step 3 or 2 is often needed here. The AI techniques of rule induction and neural networks are used for this machine learning stage.
5. The analyst examines the output data. Often visualization plays an important part in helping the analyst, especially if the analyst needs to present their analysis to others.
6. Present results to the business, in order that the insights can be incorporated into business processes (e.g. through producing output data files or installing data mining software).
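For step 3, the following is a minimal sketch of one possible transformation, mapping a numeric attribute such as age into a small set of bands; the band boundaries are illustrative assumptions only, not values used in the thesis.

Module TransformSketch
    ' Map a raw age value into a categorical band for the data-mining algorithm.
    Function AgeBand(age As Integer) As String
        If age < 25 Then
            Return "under-25"
        ElseIf age <= 35 Then
            Return "25-35"
        ElseIf age <= 50 Then
            Return "36-50"
        Else
            Return "over-50"
        End If
    End Function
End Module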
Data mining is typically not used as a business system delivery technology. Rather it is an extremely powerful and effective set of technologies for analyzing and clustering data which can be used to form the basis of a system.
APPLICATION
The key reason why data mining is such a buzzword at the moment is that many organizations have recognized the need to better understand their customers. Data mining can deliver real-world results. Data mining has been used for the following types of applications:
• Understanding purchasing behavior of customers
• Detecting credit card or insurance fraud
• Predicting probable changes in financial markets
Data mining is a powerful technology that converts detail data into competitive intelligence that businesses can use to predict future trends and behaviors. Some vendors define data mining as a tool or as the application of an algorithm to data. The truth is, data mining is not just a tool or algorithm.
Data mining is a process of discovering and interpreting previously unknown patterns in data to solve business problems. Data mining is an iterative process, which means that each cycle further refines the result set. This can be a complex process, but there are tools and approaches available today to help navigate successfully through the steps of data mining projects.
From an IT perspective, the data mining process requires support for the following activities
• Exploring the data
• Creating the analytic data set
• Building and testing the model
• Integrating the results into business applications
Therefore, the IT organization must provide an environment capable of addressing the following challenges:
• Exploring and pre-processing large data volumes
• Providing sufficient processing power to efficiently analyze many variables (columns) and records (rows) in a timely manner
• Integrating data mining results into the business process
• Creating an extensible and manageable data mining environment
HOW DOES DATA MINING WORK?
Data mining leverages artificial intelligence and statistical techniques to build models. Data mining models are built from situations where we know the outcome. These models are then applied to other situations where we do not know the outcome.
For example, if our data warehouse identifies customers who have responded to past marketing campaigns, we can create a model that identifies the characteristics of those customers. This model can then be applied to a wider customer database, identifying customers who demonstrate the same characteristics and allowing us to target those likely to respond, thereby improving response rates and reducing marketing cost.
Business problems that lend themselves to data mining are predictive and descriptive in nature. Predictive models are used to predict an outcome, referred to as the dependent or target variable, based on the value of other variables in the data set.
For example, a predictive model could determine the likelihood that a customer will purchase a product based on her income, number of children, current product ownership, or debt. Predictive techniques build models based on a “training” set of data with a known outcome, such as prior buying patterns.
The algorithm analyzes the values of all input variables and identifies which variables are significant as predictors for a desired outcome.
Unlike predictive models, descriptive models do not predict variables based on known outcomes. Instead, they describe a particular pattern that has no known outcome. Common techniques include data visualization, where large volumes of data are reduced to a picture that can be easily understood. Another common descriptive technique is clustering, where data is grouped into subsets based on common attributes.
For example, one may use descriptive techniques to determine customer segments and their attributes.
In many cases, both descriptive and predictive models are used to solve business problems. A descriptive technique may identify customer segments based on value in terms of profitability to our business, and a predictive technique may identify the likelihood a particular segment will defect to our competitor.
By combining the results of the descriptive technique with a prediction of customer defection, one can act to prevent attrition of high-value customers.
THE DATA MINING PROCESS
One cannot simply buy a data mining product, apply it to data and expect to generate a meaningful model. Data mining models are built as part of a data mining process, an ongoing process requiring maintenance throughout the life of the model.
The data mining process is not linear, but an iterative process in which one loops back to previous phases. For example, the initial model created may lead to insights that require returning to the data pre-processing phase to create new analytical variables.
THE DATA MINING PROCESS CONTAINS FOUR HIGH-LEVEL STEPS
(1) Define the business problem,
(2) Explore and pre-process the data,
(3) Develop the data model, and
(4) Deploy knowledge.
Although each step is important, most of our time will be spent in the data exploration and pre-processing phase. A well-structured data warehouse can significantly reduce the pain felt in this phase.
In data mining, association rule learners are used to discover elements that co-occur frequently within a data set consisting of multiple independent selections of elements (such as purchasing transactions), and to discover rules, such as implication or correlation, which relate co-occurring elements.
This application of association rule learners is also known as market basket analysis. As with most data mining techniques, the task is to reduce a potentially huge amount of information to a small, understandable set of statistically supported statements.
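To make the notions of support and confidence used in market basket analysis concrete, here is a minimal sketch over toy transaction data; the item names are illustrative only and not taken from the thesis.

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module RuleMetricsSketch
    ' Fraction of transactions that contain the given itemset.
    Function Support(transactions As List(Of HashSet(Of String)), itemset As HashSet(Of String)) As Double
        Dim hits = transactions.Where(Function(t) itemset.IsSubsetOf(t)).Count()
        Return hits / CDbl(transactions.Count)
    End Function

    Sub Main()
        ' Toy market-basket data with illustrative item names.
        Dim transactions As New List(Of HashSet(Of String)) From {
            New HashSet(Of String) From {"bread", "milk"},
            New HashSet(Of String) From {"bread", "butter"},
            New HashSet(Of String) From {"bread", "milk", "butter"},
            New HashSet(Of String) From {"milk"}
        }
        ' confidence(bread => milk) = support({bread, milk}) / support({bread})
        Dim body As New HashSet(Of String) From {"bread"}
        Dim ruleItems As New HashSet(Of String) From {"bread", "milk"}
        Console.WriteLine("support = {0}", Support(transactions, ruleItems))
        Console.WriteLine("confidence = {0}", Support(transactions, ruleItems) / Support(transactions, body))
    End Sub
End Module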
The input for a typical association-mining algorithm is a set T of itemsets t, each of which is drawn from a set I of all possible items. Each t is a member of the power set 2^I, but T is not considered a subset of 2^I since it may contain duplicates (it is a multiset).
Since I is typically large, the general problem of finding all common subsets in an arbitrary selection of itemsets is considered intractable. Therefore the input sets in T, and any results derived therefrom, are typically assumed to be small. It is an ongoing area of research to find algorithms which relax this assumption and allow processing of larger sets.
SOFTWARE SPECIFICATION
Introduction To .NET
The .NET Framework:
The .NET Framework represents a unified, Object-Oriented set of services and libraries that embrace the changing role of new network-centric and network aware software. In fact, the .NET Framework is the first platform designed from the ground up with the Internet in mind. The .NET Framework is a common environment for building, deploying and running Web Applications.
The .NET Framework contains common class libraries, like ADO.NET, ASP.NET and Windows Forms, to provide advanced standard services that can be integrated into a variety of computer systems. The .NET Framework is the infrastructure for the new Microsoft .NET Platform.
The .NET framework is language neutral. Currently it supports C++, C#, Visual Basic, JScript (the Microsoft version of JavaScript) and COBOL. Third-party languages, like Eiffel, Perl, Python, Smalltalk and others, will also be available for building future .NET Framework applications.
The new Visual Studio .NET is a common development environment for the new .NET Framework. It provides a feature-rich application execution environment, simplified development and easy integration between a number of different development languages.
The .NET Framework is a set of technologies for developing and using components to create Web Forms, Web Services and Windows Applications. It supports the software lifecycle: development, debugging, deployment and maintenance.
Benefits of the .NET Framework:
The .NET Framework offers a number of benefits to developers:
• A consistent programming model
• Direct support for security
• Simplified development efforts
• Easy application deployment and maintenance
Over its lifetime, VB has evolved from an object-based development tool towards an object-oriented development tool, but previous versions of Visual Basic were still missing a few key features. Using previous versions of VB, programmers have been able to create classes and use those classes in building applications, but VB was missing key object-oriented development features.
With VB.NET, objects also have a lifecycle, but things are not quite the same as in VB6. The concept of the Class_Terminate event changes rather substantially, and the concept behind the Class_Initialize event is morphed into a full-blown constructor method that accepts parameters.
Construction
Object construction is triggered any time we create a new instance of a class. This is done using the New keyword - a level of consistency that didn't exist with VB6 where we got to choose between New and CreateObject.
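A minimal sketch of the constructor behaviour described above; the class and parameter names are illustrative, not taken from the thesis software.

Public Class Customer
    Private _name As String

    ' Constructor: the VB.NET replacement for VB6's Class_Initialize,
    ' and it can accept parameters.
    Public Sub New(name As String)
        _name = name
    End Sub

    Public ReadOnly Property Name As String
        Get
            Return _name
        End Get
    End Property
End Class

' Construction is always triggered with the New keyword:
' Dim c As New Customer("Acme Ltd")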
Sub Main
Since VB6 was based on COM, creating an object could trigger a Sub Main procedure to be run. This would happen the first time an object was created from a given component - often a DLL. Before even attempting to create the object, the VB6 runtime would load the DLL and run the Sub Main procedure.
The .NET Common Language Runtime doesn't treat components quite the same way, and so neither does VB.NET. This means that no Sub Main procedure is called as a component is loaded. In fact, Sub Main is only used once - when an application itself is first started.
As further components are loaded by the application, only code within the classes we invoke is called. It wasn't that wise to rely on Sub Main even in VB6, since that code would run prior to all the error handling infrastructure being in place. Bugs in Sub Main were notoriously difficult to debug in VB6. If we do have to use code that relies heavily on the Sub Main concept for initialization, we'll need to implement a workaround in VB.NET.
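A minimal sketch of this startup behaviour, reusing the illustrative Customer class from the construction sketch above: Sub Main runs once when the application starts, and creating objects from further components later does not re-run it.

Module Startup
    Sub Main()
        ' Runs exactly once, when the application itself starts.
        Console.WriteLine("Application starting")
        Dim c As New Customer("Acme Ltd")   ' loading and using classes later never triggers Sub Main again
        Console.WriteLine(c.Name)
    End Sub
End Module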
Termination
In VB6 an object was destroyed when its last reference was removed. In other words, when no other code had any reference to an object, the object would be automatically destroyed - triggering a call to its Class_Terminate event. This approach was implemented through reference counting - keeping a count of how many clients had a reference to each object - and was a direct product of VB's close relationship with COM.
While this behavior was nice - since we always knew an object would be destroyed immediately and we could rely on its Class_Terminate code running at that moment - the clear termination scheme used in VB6 is an example of deterministic finalization. It was always very clear when an object would be terminated.
Unlike COM, the .NET runtime does not use reference counting to determine when an object should be terminated. Instead it uses a scheme known as garbage collection to terminate objects. This means that in VB.NET we do not have deterministic finalization, so it is not possible to predict exactly when an object will be destroyed. Let's discuss garbage collection and the termination of VB.NET objects in more detail.
Garbage Collection
In .NET, reference counting is not part of the infrastructure. Instead, objects are destroyed through a garbage collection mechanism. At certain times (based on specific rules), a task will run through all of our objects looking for those that no longer have any references. Those objects are then terminated: the garbage is collected.
This means that we can't tell exactly when an object will really be finally destroyed. Just because we eliminate all references to an object doesn't mean it will be terminated immediately. It will just hang out in memory until the garbage collection process gets around to locating and destroying it.
The major benefit of garbage collection is that it eliminates the circular reference issues found with reference counting. If two objects have references to each other, and no other code has any references to either object, the garbage collector will discover and terminate them, whereas in COM these objects would have sat in memory forever.
There is also a potential performance benefit from garbage collection. Rather than expending the effort to destroy objects as they are dereferenced, with garbage collection this destruction process typically occurs when the application is otherwise idle, often decreasing the impact on the user. However, garbage collection may also occur while the application is active, if the system starts running low on resources.
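A minimal sketch of this non-deterministic finalization (the class name is illustrative): dropping the last reference does not destroy the object; its Finalize method runs only when the garbage collector eventually gets to it.

Public Class Tracer
    Protected Overrides Sub Finalize()
        ' Called by the garbage collector at some later, unpredictable time.
        Console.WriteLine("Tracer finalized")
        MyBase.Finalize()
    End Sub
End Class

Module GcSketch
    Sub Main()
        Dim t As New Tracer()
        t = Nothing                      ' last reference removed, but the object is not destroyed yet
        Console.WriteLine("Reference cleared")
        GC.Collect()                     ' only now is collection (and finalization) likely to happen
        GC.WaitForPendingFinalizers()
    End Sub
End Module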
OOP in Visual Basic .NET
Visual Basic .NET is not Visual Basic 6 with inheritance tacked onto it. Rather, Visual Basic .NET has been entirely rewritten to be fully object-oriented. In fact, everything in Visual Basic .NET can be treated as an object. Yes, even your strings and integers can be accessed as objects in Visual Basic .NET.
The first hint that your integer is treated as an object is the list of properties and methods that appear when you type the dot after the i. Select one of the properties, such as the MinValue shown in this example. Then run the application and you will get a message box containing the value of the selected integer property.
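A minimal sketch of this behaviour; Console output is used here instead of the message box mentioned above, so that the snippet stays self-contained.

Module IntegerAsObjectSketch
    Sub Main()
        Dim i As Integer = 42
        ' Even a plain Integer exposes members after the dot; MinValue and MaxValue
        ' are shared members of the Integer type, reachable through the instance.
        Console.WriteLine(i.MinValue)         ' -2147483648
        Console.WriteLine(Integer.MaxValue)   ' 2147483647
    End Sub
End Module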
.NET has predefined classes for the intrinsic data types, but what about the classes that you create? Let's walk through an example to demonstrate how to create classes in Visual Basic .NET and inherit from them to leverage some of the new OOP features.
Object-Oriented Features
Objects serve as the building blocks in an OOP language. An object has a unique identity and displays unique behavior. Examples of objects from the world around us are a car, a ball or a clock. In a programming language an object is defined as an instance of a class. All applications created in an OOP language are made up of objects.
A programming language qualifies as an OOP language if it supports the following features.
• Abstraction.
• Encapsulation.
• Inheritance.
• Polymorphism.
Abstraction: VB has supported abstraction since VB4. Abstraction is merely the ability of a language to create "black box" code - to take a concept and create an abstract representation of that concept within a program. A Customer object, for instance, is an abstract representation of a real-world customer. A Recordset object is an abstract representation of a set of data.
Encapsulation: This has also been with us since version 4.0. It is the concept of a separation between interface and implementation. The idea is that we can create an interface (Public methods in a class) and, as long as that interface remains consistent, the application can interact with our objects. This remains true even if we entirely rewrite the code within a given method - thus the interface is independent of the implementation.
Encapsulation allows us to hide the internal implementation details of a class. For example, the algorithm we use to compute Pi might be proprietary. We can expose a simple API to the end user, but we hide all of the logic used for our algorithm by encapsulating it within our class.
Polymorphism: Likewise, polymorphism was introduced with VB4. Polymorphism is reflected in the ability to write one routine that can operate on objects from more than one class - treating different objects from different classes in exactly the same way. For instance, if both Customer and Vendor objects have a Name property, and we can write a routine that calls the Name property regardless of whether we're using a Customer or Vendor object, then we have polymorphism.
VB, in fact, supports polymorphism in two ways - through late binding (much like Smalltalk, a classic example of a true OO language) and through the implementation of multiple interfaces. This flexibility is very powerful and is preserved within VB.NET.
Inheritance: VB.NET is the first version of VB that supports inheritance. Inheritance is the idea that a class can gain the pre-existing interface and behaviors of an existing class. This is done by inheriting these behaviors from the existing class through a process known as subclassing. With the introduction of full inheritance, VB is now a fully OO language by any reasonable definition.
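A minimal sketch of inheritance and polymorphism in VB.NET, using the Customer/Vendor Name example mentioned above; the class bodies and member values are illustrative assumptions.

Public MustInherit Class Party
    Public MustOverride ReadOnly Property Name As String
End Class

Public Class Customer
    Inherits Party
    Public Overrides ReadOnly Property Name As String
        Get
            Return "Some customer"
        End Get
    End Property
End Class

Public Class Vendor
    Inherits Party
    Public Overrides ReadOnly Property Name As String
        Get
            Return "Some vendor"
        End Get
    End Property
End Class

Module PolymorphismSketch
    ' One routine handles both classes through the shared base type.
    Sub PrintName(p As Party)
        Console.WriteLine(p.Name)
    End Sub

    Sub Main()
        PrintName(New Customer())
        PrintName(New Vendor())
    End Sub
End Module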
METHODOLOGY
APRIORI
Definition:
Apriori is an influential algorithm for mining frequent itemsets for Boolean association rules. The name of the algorithm is based on the fact that the algorithm uses prior knowledge of frequent itemset properties. Apriori employs an iterative approach known as a level-wise search.
In computer science and data mining, Apriori is a classic algorithm for learning association rules. Apriori is designed to operate on databases containing transactions (for example, collections of items bought by customers, or details of website visits). Other algorithms are designed for finding association rules in data having no transactions or having no timestamps.
As is common in association rule mining, given a set of itemsets (for instance, sets of retail transactions, each listing individual items purchased), the algorithm attempts to find subsets which are common to at least a minimum number C of the itemsets. Apriori uses a "bottom up" approach, where frequent subsets are extended one item at a time (a step known as candidate generation), and groups of candidates are tested against the data. The algorithm terminates when no further successful extensions are found.
Apriori uses breadth-first search and a tree structure to count candidate item sets efficiently. It generates candidate item sets of length k from item sets of length k − 1. Then it prunes the candidates which have an infrequent sub pattern. According to the downward closure lemma, the candidate set contains all frequent k-length item sets. After that, it scans the transaction database to determine frequent item sets among the candidates.
Apriori, while historically significant, suffers from a number of inefficiencies or trade-offs, which have spawned other algorithms. Candidate generation produces large numbers of subsets (the algorithm attempts to load up the candidate set with as many as possible before each scan). Bottom-up subset exploration (essentially a breadth-first traversal of the subset lattice) finds any maximal subset S only after all 2^|S| − 1 of its proper subsets.
Algorithm
Association rule mining is the task of finding association rules that satisfy predefined minimum support and confidence thresholds in a given database. The problem is usually decomposed into two subproblems. One is to find those itemsets whose occurrences exceed a predefined threshold in the database; those itemsets are called frequent or large itemsets. The second is to generate association rules from those large itemsets under the constraint of minimal confidence. Suppose one of the large itemsets is Lk = {I1, I2, … , Ik}; association rules for this itemset are generated in the following way: the first rule is {I1, I2, … , Ik-1} ⇒ {Ik}, and by checking its confidence this rule can be determined to be interesting or not. Other rules are then generated by deleting the last item in the antecedent and inserting it into the consequent, and the confidences of the new rules are checked to determine their interestingness. This process is iterated until the antecedent becomes empty. Since the second subproblem is quite straightforward, most research focuses on the first subproblem. The Apriori algorithm finds the frequent itemsets Lk in database D as follows (a code sketch is given after the list):
• Find the frequent itemset Lk−1.
• Join step: Ck is generated by joining Lk−1 with itself.
• Prune step: any (k−1)-itemset that is not frequent cannot be a subset of a frequent k-itemset, and hence should be removed.
where
• Ck: candidate itemset of size k
• Lk: frequent itemset of size k
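The following is a compact sketch of this level-wise search. For brevity, support is counted by rescanning the transaction list instead of using the hash-tree optimisation, the full subset-based prune is replaced by the support scan itself, rule generation (the second subproblem) is omitted, and the minimum support value and item representation are assumptions.

Imports System
Imports System.Collections.Generic

Module AprioriSketch
    ' Count how many transactions contain the candidate itemset.
    Function SupportCount(transactions As List(Of HashSet(Of String)), candidate As HashSet(Of String)) As Integer
        Dim count As Integer = 0
        For Each t In transactions
            If candidate.IsSubsetOf(t) Then count += 1
        Next
        Return count
    End Function

    ' Level-wise search: join L(k-1) with itself to form candidates Ck,
    ' then scan the database to keep those meeting the minimum support.
    Function Apriori(transactions As List(Of HashSet(Of String)), minSupport As Integer) As List(Of HashSet(Of String))
        Dim frequent As New List(Of HashSet(Of String))()

        ' L1: frequent individual items.
        Dim items As New HashSet(Of String)()
        For Each t In transactions
            items.UnionWith(t)
        Next
        Dim current As New List(Of HashSet(Of String))()
        For Each item In items
            Dim c1 As New HashSet(Of String) From {item}
            If SupportCount(transactions, c1) >= minSupport Then current.Add(c1)
        Next

        While current.Count > 0
            frequent.AddRange(current)

            ' Join step: unite pairs of frequent (k-1)-itemsets that differ in one item.
            Dim candidates As New List(Of HashSet(Of String))()
            For i As Integer = 0 To current.Count - 1
                For j As Integer = i + 1 To current.Count - 1
                    Dim candidate As New HashSet(Of String)(current(i))
                    candidate.UnionWith(current(j))
                    If candidate.Count = current(i).Count + 1 Then
                        If Not candidates.Exists(Function(x) x.SetEquals(candidate)) Then
                            candidates.Add(candidate)
                        End If
                    End If
                Next
            Next

            ' Scan step: keep only candidates meeting the minimum support.
            Dim nextLevel As New List(Of HashSet(Of String))()
            For Each c In candidates
                If SupportCount(transactions, c) >= minSupport Then nextLevel.Add(c)
            Next
            current = nextLevel
        End While

        Return frequent
    End Function
End Module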
PERFORMANCE EVALUATION
The daily trade data of the listed companies from the stock exchange is collected for technical analysis by means of neural networks. Two learning algorithms and two weight initializations are compared. The results show that neural networks can model the time series satisfactorily, regardless of which learning algorithm and weight initialization are adopted. However, the proposed conjugate gradient with Apriori weight initialization requires a lower computation cost and learns better than steepest descent with random initialization.
Detecting trends in stock data is a decision support process. Although the Random Walk Theory claims that price changes are serially independent, traders and certain academics have observed that the market is not efficient: the movements of market prices are not random and are, to some extent, predictable.
Statistical methods and neural networks are commonly used for time series prediction. Empirical results have shown that neural networks outperform linear regression, since stock markets are complex, nonlinear, dynamic and chaotic. Neural networks are reliable for modeling nonlinear, dynamic market signals. Neural networks make very few assumptions, as opposed to the normality assumptions commonly found in statistical methods. A neural network can perform prediction after learning the underlying relationship between the input variables and outputs. From a statistician's point of view, neural networks are analogous to nonparametric, nonlinear regression models.
Proper evaluation is crucial to a prediction system development. First, it has to measure exactly the interesting effect, e.g. trading return as opposed to related, but not identical, prediction accuracy.
Second, it has to be sensitive enough to spot even minor gains. Third, it has to convince us that the gains are not merely a coincidence. Usually prediction performance is compared against published results.
Although this approach has its problems, such as data overfitting and accidental successes due to multiple (worldwide!) trials, it works well as long as everyone uses the same data and evaluation procedure, so meaningful comparisons are possible.
However, when no agreed benchmark is available, as in the financial domain, another approach must be adopted. Since the main question concerning financial data is whether prediction is at all possible, it suffices to compare a predictor’s performance against the intrinsic growth of a series – also referred to as the buy and hold strategy.
Then a statistical test can judge if there is a significant improvement.
Evaluation Data
To reasonably test a prediction system, the data must include different trends and the assets for which the system is to perform, and it must be plentiful enough to warrant significant conclusions. Overfitting a system to data is a real danger. Dividing the data into three disjoint sets is the first precaution. The training portion of the data is used to build the predictor. If the predictor involves some parameters which need to be tuned, they can be adjusted so as to maximize performance on the validation part. Then, with the system parameters frozen, its performance on an unseen test set provides the final performance estimate. In multiple tests, the significance level should be adjusted, e.g. if 10 tests are run and the best appears 99.9% significant, it really is only 0.999^10 ≈ 99% significant. If we want the system to predict the future of a time series, it is important to maintain the proper time relation between the training, validation and test sets: basically, training should involve only instances that time-precede any test data.
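A minimal sketch of such a chronological three-way split; the 60/20/20 proportions are an illustrative assumption, not the split used in the thesis.

Imports System.Collections.Generic

Module SplitSketch
    ' Chronological split of an ordered series: training precedes validation,
    ' which precedes the test set. The proportions are assumptions.
    Sub Split(series As List(Of Double),
              ByRef training As List(Of Double),
              ByRef validation As List(Of Double),
              ByRef test As List(Of Double))
        Dim n As Integer = series.Count
        Dim trainEnd As Integer = CInt(n * 0.6)
        Dim validEnd As Integer = CInt(n * 0.8)
        training = series.GetRange(0, trainEnd)
        validation = series.GetRange(trainEnd, validEnd - trainEnd)
        test = series.GetRange(validEnd, n - validEnd)
    End Sub
End Module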
Evaluation Measures
Financial forecasts are often developed to support semi-automated trading (profitability), whereas the algorithms used in those systems might originally have had different objectives. Accuracy – the percentage of correct discrete (e.g. up/down) predictions – is a common measure for discrete systems, e.g. ILP/decision trees. Square error – the sum of squared deviations from the actual outputs – is a common measure in numerical prediction, e.g. ANN.
Performance measure – incorporating both the predictor and the trading model it is going to benefit – is preferable and ideally should measure exactly what we are interested in, e.g. commission and risk adjusted return, not just return. Actually, many systems’ ’profitability’ disappears once the commissions are taken into account.
Evaluation Procedure
For data sets where instance order does not matter, N-fold cross-validation – the data divided into N disjoint parts, N−1 for training and 1 for testing, with the error averaged over all N runs – is a standard approach. However, in the case of time series data it underestimates the error, because in order to train a predictor we sometimes use data that comes after the test instances, unlike in real life, where the predictor knows only the past, not the future. For series, a sliding window approach is better suited: a window of consecutive instances is used for training and a following segment for testing, with the windows sliding over all the data as statistics are collected (a sketch follows).
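A minimal sketch of such a sliding-window evaluation; trainAndTest stands in for whatever predictor training and accuracy measurement is used, and is an assumed callback rather than something defined in the thesis.

Imports System.Collections.Generic
Imports System.Linq

Module SlidingWindowSketch
    ' Slide a training segment followed by a test segment over the whole series,
    ' collecting the test-set accuracy at each position.
    Function SlidingWindowAccuracy(series As List(Of Double),
                                   trainSize As Integer,
                                   testSize As Integer,
                                   trainAndTest As Func(Of List(Of Double), List(Of Double), Double)) As Double
        Dim accuracies As New List(Of Double)()
        Dim start As Integer = 0
        While start + trainSize + testSize <= series.Count
            Dim trainWindow = series.GetRange(start, trainSize)
            Dim testWindow = series.GetRange(start + trainSize, testSize)
            accuracies.Add(trainAndTest(trainWindow, testWindow))
            start += testSize   ' advance so each point is tested exactly once
        End While
        Return accuracies.Average()   ' requires at least one complete window
    End Function
End Module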
The Task
Some evidence suggests that markets with lower trading volume are easier to predict. Since the task of the study is to compare ML techniques, data from a relatively small and scientifically unexplored exchange is used, with the quotes, from the opening of the exchange in 1991, freely available on the Internet. At the exchange, prices are set once a day (with intraday trading introduced more recently). The main index is a capitalization-weighted average of all the stocks traded on the main floor, and provides the time series used in this study.
The learning task involves predicting the relative index value 5 quotes ahead, i.e., a binary decision whether the index value one trading week ahead will be up or down in relation to the current value. The interpretation of up and down is such that they are equally frequent in the data set, with down also including small index gains. This facilitates detection of above-random predictions – their accuracy, as measured by the proportion of correctly predicted changes, is 0.5 + s, where s is the threshold for the required significance level. For the data including 1200 index quotes, the following table presents the s values for one-sided 95% significance, assuming that 1200 – Window Size data points are used for the accuracy estimate.
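As a rough, hedged illustration of the threshold s, the sketch below uses a normal approximation to the binomial distribution of the accuracy estimate, with the 1.645 quantile corresponding to one-sided 95% significance; the example window size is an assumption, and this is a reconstruction of the setup rather than a formula quoted from the thesis.

Module SignificanceSketch
    ' One-sided 95% threshold above 0.5 for an accuracy estimated on n predictions,
    ' under a normal approximation to the binomial with p = 0.5.
    Function Threshold(n As Integer) As Double
        Return 1.645 * Math.Sqrt(0.25 / n)
    End Function

    Sub Main()
        ' e.g. 1200 quotes minus an assumed window size of 400 leaves n = 800 estimation points.
        Console.WriteLine(Threshold(800))   ' about 0.029
    End Sub
End Module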