RESEARCH ON FINANCIAL TIME SERIES FORECASTING BASED ON SVM

YANG YUJUN1,2,3, YANG YIMEI2,*, LI JIANPING1

1 School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
2 School of Computer Science and Engineering, Huaihua University, Huaihua 418008, China
3 Hunan Provincial Key Laboratory of Ecological Agriculture Intelligent Control Technology, Huaihua 418008, China
Email: yym@hhtc.edu.cn

978-1-5090-6126-6/16/$31.00 2016 IEEE

Abstract: The support vector machine (SVM) is a machine learning method developed from statistical learning theory and is widely used in classification and prediction. Because financial time series are complex, traditional forecasting methods are less reliable. In this paper, we study financial time series forecasting based on the support vector machine. Although the prediction process is slow, the approach can improve the prediction accuracy for financial time series. The experimental results demonstrate the prediction accuracy of the SVM-based approach.

Keywords: SVM; Forecasting; Time series; Support vector machine

1. Introduction

In many practical applications, SVM has shown outstanding performance, especially in prediction and classification problems. In real-world applications, many problems are complex, inherently noisy, non-stationary and deterministically chaotic [1]. To reduce the space and time complexity of the support vector machine, many improved algorithms have been applied successfully. One method is to obtain a low-order approximation of the kernel matrix by a greedy algorithm [2], by sampling [3], or by matrix decomposition. Another method is to improve the efficiency of the SVM block algorithm. A third approach is to avoid the quadratic programming problem, as in the central support vector machine algorithm [4] and the scale method [5].

Financial time series are complex and inherently noisy. The noise reflects the fact that the past behavior of financial markets does not provide complete information, so the dependency between future and past prices cannot be fully captured. Therefore, stock market prediction is regarded as one of the most challenging tasks in time series forecasting. In this paper, we study financial time series forecasting based on the support vector machine. Although the prediction process is slow, the approach can improve the prediction accuracy for financial time series. The experimental results demonstrate the prediction accuracy of the SVM-based approach.

2. SVM method theory

The support vector machine (SVM) was developed by Vapnik and his colleagues. SVM emerged from research in statistical learning theory on how to regulate generalization in learning and how to trade off structural complexity against empirical risk. For the nonlinearly separable case, we can use a nonlinear function $\phi(x)$ to map the data into a high-dimensional feature space and then construct an optimal hyperplane in that feature space. The equation of this hyperplane can be expressed as follows:

$$W^{T}x + b = 0 \qquad (1)$$

The sigmoid function is used to map the independent variable to the interval (0, 1); the mapped value is interpreted as the probability that the sample belongs to class y = 1. Assume that there is a function

$$h_{\theta}(x) = g(\theta^{T}x) = \frac{1}{1 + e^{-\theta^{T}x}} \qquad (2)$$

where x is the n-dimensional feature vector and g is the logistic function. The probabilities of the two classes are

$$P(y = 1 \mid x; \theta) = h_{\theta}(x), \qquad P(y = 0 \mid x; \theta) = 1 - h_{\theta}(x) \qquad (3)$$

Thus, to decide which class a new feature vector belongs to, we only need to calculate the value of $h_{\theta}(x)$. If it is greater than 0.5, the sample belongs to the class y = 1; if it is less than 0.5, the sample belongs to the class y = 0.

3. SVM Model

The support vector machine is based on linear separation. However, not all data are linearly separable.
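
The role of the nonlinear map from Section 2 can be illustrated with a small sketch. The four points, their labels, the map `phi`, and the search grid below are assumptions chosen for illustration and do not come from the experiments in this paper. The sketch checks that no line on a coarse grid of candidates separates the two classes in the plane, while a single hyperplane does so after the map appends the product of the two coordinates as a third feature.

```python
from itertools import product

# Four XOR-like points in the plane; each label is the product of the coordinates.
# These points are illustrative assumptions, not data from the paper.
POINTS = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
LABELS = [1, 1, -1, -1]


def phi(x):
    """Hypothetical nonlinear map: append the product x1 * x2 as a third feature."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


def separates(w, b, samples, labels):
    """Return True if y * (w . x + b) > 0 for every sample (all classified correctly)."""
    return all(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) > 0
        for x, y in zip(samples, labels)
    )


# Coarse grid of candidate lines w1*x1 + w2*x2 + b = 0 in the original plane.
grid = [i / 2 for i in range(-6, 7)]  # -3.0, -2.5, ..., 3.0
found_in_plane = any(
    separates((w1, w2), b, POINTS, LABELS) for w1, w2, b in product(grid, repeat=3)
)

# In the mapped space, the hyperplane with w = (0, 0, 1) and b = 0 separates the classes.
mapped = [phi(x) for x in POINTS]
found_in_feature_space = separates((0.0, 0.0, 1.0), 0.0, mapped, LABELS)

print(found_in_plane, found_in_feature_space)
```

The grid search is only a finite check, not a proof. For these particular points, however, the quantities y(w·x + b) always sum to zero, so they cannot all be positive for any choice of w and b.
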
As a simple example of the separable case, suppose that two different classes of data lie on a two-dimensional plane. Because these data are linearly separable, a line can separate the two classes. This line plays the role of a hyperplane: the data points on one side of it have label y = -1 and those on the other side have label y = 1. Because the sample is linearly separable, the separation requirement can be met in the original space.

For a given hyperplane w·x + b = 0, the quantity |w·x + b| expresses the distance from the point x to the hyperplane (up to the factor 1/‖w‖). By checking whether the sign of w·x + b agrees with the class label y, we can determine whether the classification is correct; that is, the sign of y(w·x + b) indicates the correctness of the classification. This leads to the concept of the functional margin, defined as follows:

$$\gamma = y(\omega^{T}x + b) = yf(x) \qquad (4)$$

where $f(x) = \omega^{T}x + b$ and $\omega$ denotes the weight vector written as w above.

4. Experimental data and empirical results analysis

To illustrate the prediction accuracy of the SVM-based approach, we choose two indices from real stock markets as experimental data: the SP500 and the HSI. SP500 is short for the Standard & Poor's 500, an American stock market index based on the market capitalizations of 500 large companies. HSI is short for the Hang Seng Index of Hong Kong, China. The original sample data cover 3762 trading days in the period from July 9, 2001 to December 24, 2015 and were obtained free of charge from Yahoo Finance. The data set of each index includes five daily properties: opening price, highest price, lowest price, closing price and transaction volume. Fig.1 and Fig.2 show the daily opening prices of HSI and SP500, respectively.

Fig.1 The daily opening price of HSI

First, we take the daily opening prices of trading days 1 to 3761 as the independent variable of the SVM function.
Correspondingly, we take the daily opening prices of trading days 2 to 3762 as the dependent variable of the SVM function. To facilitate the calculation, we normalize all data and map them to the range between 1 and 2.

Fig.2 The daily opening price of SP500

To achieve the best prediction results, we first search for good regression parameters c and g. Fig.3 and Fig.4 show the parameter selection result of SVR. We use the regression analysis to obtain the best parameters for training the SVM.

Fig.3 The contour of the parameter selection result of SVR

Fig.4 The 3D view of the parameter selection result of SVR

We use the best values of c and g to train the SVM and perform regression prediction on the daily opening prices of the HSI and SP500. The final regression prediction, the error and the relative error for SP500 are shown in Fig.5, Fig.6 and Fig.7, respectively. To display the prediction result of Fig.5 more clearly, Fig.8 enlarges the range from 2780 to 2900 of Fig.5.

Fig.5 The result of final regression prediction

Fig.6 The result of final regression prediction error

Fig.7 The result of final regression prediction relative error

Fig.8 Part of the result of final regression prediction

To compare the effect of the SVM-based approach on different data, we give the final regression prediction error for HSI in Fig.9 and the corresponding relative error in Fig.10.

Fig.9 The result of final regression prediction error for HSI

Fig.10 The result of relative error for HSI prediction price

Fig.10 and Fig.7 show the relative error of the predicted price for HSI and SP500, respectively. Comparing the two figures, we find that the relative error in Fig.7 is smaller than that in Fig.10. HSI is a stock market index of China and SP500 is an American stock market index. Since the American stock market is more mature than the Chinese stock market, the results in Fig.10 and Fig.7 suggest that the approach performs better in the mature market than in the immature market.

5. Conclusions

Because financial time series are complex, traditional forecasting methods are less reliable. In this paper, we study financial time series forecasting based on the support vector machine. In the experiment, we choose two indices from real stock markets, SP500 and HSI, as experimental data. The experimental results demonstrate the prediction accuracy of the SVM-based approach. We analyze the final regression prediction results and find that the approach performs better in the mature market than in the immature market.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (No.61370073), the Sichuan Province Science and Technology Support Program (No.2013GZX0165), the Constructing Program of the Key Discipline in Huaihua University, the Scientific Research Fund of Hunan Provincial Education (No.14C0886), the Science and Technology Plan Projects of Huaihua City, and the Key Laboratory of Intelligent Control Technology for Wuling-Mountain Ecological Agriculture in Hunan Province (No.ZNKZ2014-9).

References

[1] Bin Gui, Xianghe Wei, "Financial Time Series Forecasting Using Support Vector Machine", 2014 Tenth International Conference on Computational Intelligence and Security, pp. 39-44, 2014.
[2] Mingjun Song, S. Rajasekaran, "A greedy algorithm for gene selection based on SVM and correlation", International Journal of Bioinformatics Research and Applications, Vol. 6, No. 3, pp. 296-307, 2010.
[3] Huichuan Duan, Naiwen Liu, "A Greedy Search Algorithm for Resolving the Lowermost C Threshold in SVM Classification", 2013 Ninth International Conference on Computational Intelligence and Security, Leshan, China, pp. 190-193, Dec. 2013.
[4] Jing-Yi Du, Yuan-Bin Hou, "Greedy algorithms with Kernel matrix approximation", Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, Vol. 20, No. 1, pp. 138-143, Feb. 2007.
[5] Shen Zhaoqing, Peng Yuhua, Shu Ning, "A Road Damage Identification Method based on Scale-span Image and SVM", Geomatics and Information Science of Wuhan University, Vol. 38, No. 8, pp. 993-997, Aug. 2013.
