B-Spline Estimation in a Semiparametric Regression Model with Nonlinear Time Series Errors

We study estimation problems for a partly linear regression model with a nonlinear time series error structure. The model consists of a parametric linear component for the regression coefficients and a nonparametric nonlinear component. The random errors are unobservable and modeled by a first-order Markov bilinear process. Based on a B-spline series approximation of the nonlinear component function, we propose a semiparametric ordinary least squares estimator and a semiparametric generalized least squares estimator of the regression coefficients, a least squares estimator of the autoregression parameter for the errors, and a B-spline series estimator of the nonparametric component function. The asymptotic properties of these estimators are investigated and their asymptotic distributions are derived. We also provide a consistent estimator for the asymptotic covariance matrix of the semiparametric generalized least squares estimator of the regression coefficients. Our results can be used to make asymptotically efficient statistical inferences. In addition, a small simulation is conducted to evaluate the performance of the proposed estimators; it shows that the semiparametric generalized least squares estimator of the regression coefficients is more efficient than the semiparametric ordinary least squares estimator.


INTRODUCTION
Partly linear regression models have attracted much research interest due to their flexibility to accommodate both linear and nonlinear components, as well as serially correlated errors, which enables them to describe increasingly complex real-world data better than purely parametric or nonparametric models. The model in (1.1) has been extensively studied. A brief review of the relevant literature is as follows. When the errors ε_i are i.i.d. random variables, [1][2][3][4][5] used various estimation methods, such as the kernel method, spline method, series estimation, local linear estimation, M-estimation and two-stage estimation, to obtain estimates of the unknown quantities in (1.1); they also discussed the asymptotic properties of these estimators. However, the independence assumption for the errors is not always appropriate in applications, especially for sequentially collected economic data, which often exhibit evident serial dependence in the errors. For example, in fitting the relationship between temperature and electricity usage, [6] found that the data are serially correlated. When ε_i is an autoregressive (AR) series, [7] studied an estimator of the autocorrelation coefficient, and [8] considered the estimation problem for model (1.1) with linear time series errors.
It is well known that not all correlated errors can be fitted well by linear time series models. Therefore, much attention has shifted to nonlinear time series models in the recent literature; see [9] and the references therein. Many papers have been concerned with ordinary linear models with nonlinear time series errors. For example, under the assumption of random coefficient autoregressive errors, [10] investigated the limit distribution of the least squares estimators of the regression and autoregression parameters. Moreover, [11] used the framework of the information matrix (IM) test to develop a test for the linear regression model when the errors are from an autoregressive conditional heteroscedastic (ARCH) process, and [12] derived the Wald and Rao's score test statistics for testing the effects of additional regression parameters. There has, however, been little work on the partly linear regression model with nonlinear time series errors, except for [13], which investigated the estimation problems of partly linear regression models with random coefficient autoregressive errors.
In this study, by approximating the nonparametric component with a B-spline series, we study the problems of estimating the parametric and nonparametric components of the partly linear regression model (1.1) with a nonlinear time series error structure. More specifically, we consider a first-order Markovian bilinear error process ε_i, defined as the stationary solution of equation (1.2), where {e_i} is a zero-mean sequence of i.i.d. random variables with finite second moments. Obviously, model (1.2) includes the usual AR(1) structure. [14] discussed estimation and testing problems for the ordinary linear regression model with error structure (1.2). Model (1.2) has also been extensively discussed in the control theory literature [15,16]. [17] applied model (1.2) to study the well-known Wolfer sunspot numbers for the years 1700 to 1955 and a seismic record obtained from an underground nuclear explosion carried out in the USA on October 29th, 1966. Recently, model (1.2) has been extended to the space-time setting [18,19]. More references on the theoretical results, applications and extensions of model (1.2) can be found in the monograph [9].
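The display defining (1.2) is not reproduced above, but a common first-order Markovian bilinear specification is ε_i = (φ + b·e_i) ε_{i-1} + e_i, which reduces to the usual AR(1) when b = 0. The following sketch simulates such a process under that assumed form; the coefficient b and the Gaussian innovations are illustrative choices, not taken from the paper:

```python
import numpy as np

def simulate_bilinear(n, phi, b, sigma=1.0, burn=500, seed=None):
    """Simulate eps_i = (phi + b*e_i) * eps_{i-1} + e_i, an assumed
    first-order Markovian bilinear form; b = 0 gives the usual AR(1)."""
    rng = np.random.default_rng(seed)
    e = rng.normal(0.0, sigma, size=n + burn)
    eps = np.zeros(n + burn)
    for i in range(1, n + burn):
        eps[i] = (phi + b * e[i]) * eps[i - 1] + e[i]
    return eps[burn:]  # discard burn-in so the output is near-stationary

# example: a (second-order) stationary path needs phi^2 + b^2*sigma^2 < 1
eps = simulate_bilinear(20000, phi=0.5, b=0.2, seed=42)
```

For this assumed form one can check that the lag-1 autocorrelation of ε_i equals φ, because e_i is independent of ε_{i-1}; this is what makes a residual-based least squares estimator of φ sensible later on.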
Based on the approximation of g(·) by a B-spline series and least squares estimation, we construct estimators for model (1.1) with error structure (1.2). We further investigate the asymptotic properties of these estimators and derive their limiting distributions. In addition, a small simulation is conducted to evaluate the performance of the estimators.
Estimators: Throughout this study we assume that the design points x_i and t_i are fixed and related via a known relation; the reasonableness of this relation is discussed in [2]. In addition, suppose that the vector (1,…,1)' is not in the space spanned by the column vectors of X = (x_1,…,x_n)', which ensures the identifiability of model (1.1) according to [1]. It is also assumed that the sequence of designs t_i forms an asymptotically regular sequence in the sense of [20]. Minimizing the least squares criterion with g(·) replaced by its B-spline series approximation yields the estimator β̂_n, called the semiparametric ordinary least squares estimator (SOLSE). When the errors are correlated, the SOLSE β̂_n is not asymptotically efficient, as it ignores the correlation. Hence, for a given φ, we propose a semiparametric generalized least squares estimator (SGLSE) of β, obtained by weighting the least squares criterion by the error covariance structure. The matrices involved are invertible when n is large; therefore, without loss of generality, we can assume that the inverses of these two matrices exist.
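As a concrete illustration of the estimation idea, here is a minimal sketch of the SOLSE: build a B-spline design matrix at the t_i via the standard Cox–de Boor recursion, regress y on the augmented design [X, B], and read off the first p coefficients as β̂_n. The function names, knot choices and simulated design below are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

def bspline_basis(t, knots, degree=3):
    """B-spline design matrix via the Cox-de Boor recursion (standard
    construction; boundary knots are repeated `degree` times)."""
    t = np.asarray(t, dtype=float)
    k = np.concatenate(([knots[0]] * degree, knots, [knots[-1]] * degree))
    # degree-0 basis: indicator functions of the knot intervals
    B = np.zeros((t.size, k.size - 1))
    for j in range(k.size - 1):
        B[:, j] = (t >= k[j]) & (t < k[j + 1])
    last = int(np.nonzero(k[:-1] < k[1:])[0].max())
    B[t == k[-1], last] = 1.0        # include the right endpoint
    for d in range(1, degree + 1):   # raise the degree one step at a time
        Bn = np.zeros((t.size, B.shape[1] - 1))
        for j in range(Bn.shape[1]):
            d1, d2 = k[j + d] - k[j], k[j + d + 1] - k[j + 1]
            if d1 > 0:
                Bn[:, j] += (t - k[j]) / d1 * B[:, j]
            if d2 > 0:
                Bn[:, j] += (k[j + d + 1] - t) / d2 * B[:, j + 1]
        B = Bn
    return B

def solse(y, X, t, knots, degree=3):
    """Semiparametric OLS sketch: regress y on [X, B-spline basis of t];
    the first p coefficients estimate beta, the rest approximate g."""
    B = bspline_basis(t, knots, degree)
    coef, *_ = np.linalg.lstsq(np.column_stack([X, B]), y, rcond=None)
    p = X.shape[1]
    return coef[:p], B @ coef[p:]    # (beta_hat, g_hat at the t_i)
```

With uniform knots and a smooth g, the spline columns absorb g(t) (and any intercept, which is why no constant column is included), so the coefficient on X estimates β.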
In practice φ is unknown and must be estimated from the estimated residuals. Consequently, we define our feasible SGLSE of β, denoted by β̂_G. Based on this SGLSE β̂_G, we construct an estimator of the nonparametric component; furthermore, an estimator of the asymptotic covariance matrix of β̂_G is given. Large sample properties: We begin with the assumptions required to derive the main results, which are quite mild and easily satisfied (see Remark 2 below).
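A feasible version of the SGLSE can be sketched as a Cochrane–Orcutt-type two-step procedure: estimate φ by least squares from the SOLSE residuals, quasi-difference the data with φ̂, and re-fit by least squares. This is an assumed simplification of the paper's weighting scheme, whose exact matrices are not reproduced here:

```python
import numpy as np

def estimate_phi(resid):
    """Least squares estimate of phi from residuals:
    phi_hat = sum(r_i * r_{i-1}) / sum(r_{i-1}^2)."""
    return float(resid[1:] @ resid[:-1] / (resid[:-1] @ resid[:-1]))

def sglse(y, Z, phi):
    """Quasi-difference with phi, then re-fit by least squares
    (Cochrane-Orcutt-style stand-in for the exact GLS weighting)."""
    coef, *_ = np.linalg.lstsq(Z[1:] - phi * Z[:-1],
                               y[1:] - phi * y[:-1], rcond=None)
    return coef
```

Because E[ε_i | ε_{i-1}] = φ ε_{i-1} under the assumed bilinear form, the residual-based φ̂ is consistent, and the quasi-differenced errors b ε_{i-1} e_i + e_i are serially uncorrelated, which is what restores efficiency in the second-step fit.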

Remark 2:
The above u_ij behave like zero-mean, uncorrelated random variables, and h_j(t_i) is the regression of x_ij on t_i. Specifically, suppose that the design points (x_i, t_i) are i.i.d. random variables; then, by the law of large numbers, (3.1) holds with probability 1. Moreover, according to [22], (3.2) holds when the u_ij behave like zero-mean, uncorrelated random variables. Assumption 2 is mild and holds for most commonly used functions, such as polynomial and trigonometric functions. The first theorem below establishes the asymptotic normality of the SOLSE β̂_n; the quantity involved in its limit is recursively defined [23]. The following theorem establishes the asymptotic normality of the B-spline series estimator ĝ(·).
For inference about β based on the asymptotic distribution of β̂_G, an estimator of its asymptotic covariance matrix is needed. Let Σ_2 be given by (2.6) and (2.7). Then we have the following result.

Remark 3:
By applying the tensor-product B-spline technique [24], the above results can easily be extended to the case of a multivariate regressor t.
A simulation study: This section presents a simulation study to evaluate the finite-sample performance of the estimators. The observations are generated from model (1.1) with error structure (1.2) (the values are generated once for each fixed φ value), and we estimate φ, β and g(·) for each sample. We use uniform knots here; according to [24], uniform knots are usually sufficient when the function g(·) does not exhibit dramatic changes in its derivatives. Thus we only need to determine the number of knots, for which we use the method in [24]. Biases and sample variances (Var) of the simulated estimates are given in Tables 1 and 2.

Table 2: Simulated biases and variances of the estimators for g(·)

t    g(t)    Bias(ĝ_n)    Var(ĝ_n)    Bias(ĝ_G)    Var(ĝ_G)

From Table 1 we can see that in all cases the semiparametric generalized least squares estimator β̂_G has smaller biases and variances than the semiparametric ordinary least squares estimator β̂_n. The advantage of β̂_G over β̂_n is more significant when φ is large (high serial correlation), as one would expect, since β̂_G takes the serial correlation into account whereas β̂_n does not.
Moreover, as φ increases, the bias and variance of β̂_G decrease, but this is not the case for β̂_n. In addition, the performance of β̂_G is close to that of the φ-known semiparametric generalized least squares estimator across the values of φ. Table 1 also shows that the estimator φ̂_n of φ is adequate.
From Table 2 we can see that the nonparametric estimator ĝ_G based on the semiparametric generalized least squares estimator β̂_G is better than the nonparametric estimator ĝ_n based on the semiparametric ordinary least squares estimator β̂_n in terms of both bias and variance.
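A tiny Monte Carlo in the spirit of this comparison (with an illustrative linear design and assumed bilinear errors, not the paper's exact setup) reproduces the qualitative finding that the GLS-type estimator has smaller sampling variance when both the regressor and the errors are serially correlated:

```python
import numpy as np

def mc_compare(reps=300, n=300, phi=0.6, b=0.2, seed=0):
    """Compare sampling variances of the OLS and feasible-GLS slope
    estimates under bilinear errors (illustrative design only)."""
    rng = np.random.default_rng(seed)
    ols, gls = [], []
    for _ in range(reps):
        e = rng.normal(size=n)
        u = rng.normal(size=n)
        x = np.zeros(n)
        eps = np.zeros(n)
        for i in range(1, n):
            x[i] = 0.8 * x[i - 1] + u[i]                   # correlated regressor
            eps[i] = (phi + b * e[i]) * eps[i - 1] + e[i]  # bilinear error
        y = 2.0 * x + eps
        Z = np.column_stack([x, np.ones(n)])
        b_ols, *_ = np.linalg.lstsq(Z, y, rcond=None)
        r = y - Z @ b_ols
        ph = r[1:] @ r[:-1] / (r[:-1] @ r[:-1])            # LS estimate of phi
        b_gls, *_ = np.linalg.lstsq(Z[1:] - ph * Z[:-1],
                                    y[1:] - ph * y[:-1], rcond=None)
        ols.append(b_ols[0])
        gls.append(b_gls[0])
    return np.var(ols), np.var(gls)
```

Note that when the regressor is i.i.d., OLS loses little efficiency even under serially correlated errors; the autocorrelated x above is what makes the GLS gain clearly visible.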

CONCLUSION
In this article we have studied the estimation problem for a partly linear regression model with bilinear time series errors. Using B-splines to approximate the nonparametric component, we constructed the semiparametric ordinary and generalized least squares estimators of the parametric component, the least squares estimator of the autoregressive parameter, and the B-spline series estimator of the nonparametric component. We also derived the asymptotic normality of these estimators. In both theory and simulation, we demonstrated that the semiparametric generalized least squares estimator is more efficient than the semiparametric ordinary least squares estimator.

Appendix: Proofs of Theorems:
In order to prove the theorems presented earlier, we first introduce several lemmas. The proofs of Lemmas 1 and 2 can be found in [25].
Proof: By the properties of {ε_i} and the lemma in [26], it is not difficult to complete the proof.
We are now ready to prove the main results.
Proof of Theorem 3.1: By (2.1), the stated expansion is easy to verify; the conclusion of Theorem 3.1 then follows from Lemma 3 and Assumption 1.

Proof of Theorem 3.2:
It is easy to see that the following equation holds. According to the definition of ε_i in (2.3), we have: