I am struggling with the statistical analysis of a dataset and would appreciate help from the community. Here is what I am trying to do.
1. The linear regression model I am trying to use is Y = b1X1 + b2X2 + e. I want to estimate b1 and b2 so that the error is minimized.
2. I want to randomly select 25 percent of the observations in the dataset and estimate b1 and b2 from that subset.
3. The estimated coefficients are then used to predict the Y values for the remaining 75 percent of the data and to calculate the prediction error. The idea is to estimate the coefficients on a subset and check how robust the estimates are.
4. The process is repeated, say, 100 times.
5. For each iteration, I would like to store the statistical results and export them to an Excel file.
Each time I try, I get stuck on one or two of the steps above. Could anyone help me with the approach?
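To make the five steps concrete, here is a minimal sketch in Python using only the standard library. Several things in it are assumptions rather than part of the problem statement: the data is taken to be a list of (x1, x2, y) tuples, the model has no intercept (as written in step 1), the error measure is the root mean squared error on the held-out 75 percent, and the function names are hypothetical. The results are written as CSV, which Excel opens directly; a native .xlsx file would need a third-party library (for example pandas, whose DataFrame.to_excel does this).

```python
import csv
import math
import random


def fit_two_coefficients(rows):
    """Least-squares fit of y = b1*x1 + b2*x2 (no intercept).

    rows is a list of (x1, x2, y) tuples; solves the 2x2 normal equations.
    """
    s11 = sum(x1 * x1 for x1, _, _ in rows)
    s12 = sum(x1 * x2 for x1, x2, _ in rows)
    s22 = sum(x2 * x2 for _, x2, _ in rows)
    s1y = sum(x1 * y for x1, _, y in rows)
    s2y = sum(x2 * y for _, x2, y in rows)
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("X1 and X2 are collinear in this subset")
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


def repeated_holdout(rows, n_iter=100, train_frac=0.25, seed=0):
    """Repeat: fit on a random train_frac of rows, score on the rest."""
    rng = random.Random(seed)  # fixed seed so the runs are reproducible
    n_train = max(2, round(train_frac * len(rows)))
    if n_train >= len(rows):
        raise ValueError("no observations left for the hold-out set")
    results = []
    for i in range(1, n_iter + 1):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        b1, b2 = fit_two_coefficients(train)
        # Prediction error on the observations not used for fitting.
        sq_errors = [(y - (b1 * x1 + b2 * x2)) ** 2 for x1, x2, y in test]
        rmse = math.sqrt(sum(sq_errors) / len(sq_errors))
        results.append({"iteration": i, "b1": b1, "b2": b2, "test_rmse": rmse})
    return results


def export_results(results, path):
    """Write one row per iteration to a CSV file that Excel can open."""
    fields = ["iteration", "b1", "b2", "test_rmse"]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(results)
```

With the data loaded into rows, calling results = repeated_holdout(rows) followed by export_results(results, "results.csv") would cover steps 2 to 5. The spread of b1, b2 and test_rmse across the 100 rows then gives a rough picture of how stable the estimates are.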