|
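Recap
Gravity model of regional labor flows:
Lflow_{ij} = lnL_i × ln(w_j - w_i) × ln(hp_j - hp_i) × D_{ij}^{-2}
- Lflow_{ij} is the amount of labor flowing from region i to region j
- L_i is the labor force of region i, proxied by each prefecture-level city's year-end population
- w is the average wage of employees in urban units in the corresponding region
- hp is the region's housing price level, measured by the average sales price of residential housing
- D is the geographic distance between regions, computed from their longitude and latitude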
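As a rough illustration of the pairwise formula above, the following Python sketch evaluates Lflow_{ij} for one origin-destination pair. The function name `pair_flow`, its argument names, and the sample numbers are hypothetical. Returning None when a wage or price gap is non-positive is an assumption of this sketch, made because the logarithm is undefined there.

```python
import math


def pair_flow(L_i, w_i, w_j, hp_i, hp_j, dist_ij):
    """Lflow_ij = ln(L_i) * ln(w_j - w_i) * ln(hp_j - hp_i) * D_ij^(-2).

    Returns None when either gap is non-positive (assumption of this sketch).
    """
    wage_gap = w_j - w_i
    price_gap = hp_j - hp_i
    if wage_gap <= 0 or price_gap <= 0:
        return None
    return (math.log(L_i) * math.log(wage_gap)
            * math.log(price_gap) * dist_ij ** -2)


# Hypothetical inputs: population, wages, housing prices, distance in km
flow = pair_flow(L_i=5_000_000, w_i=8_000, w_j=14_000,
                 hp_i=1_300, hp_j=4_800, dist_ij=280.0)
print(flow)
```

Because the distance enters with exponent -2, doubling the distance between two otherwise identical cities cuts the computed flow to a quarter.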
Lflow_{i} = \sum_{j=1}^n Lflow_{ij}
Data sources
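CSMAR (国泰安).
- Table: City-level population, employment and wages (分城市人口、就业与工资)
- Period: 1978 to 2020
- Codes: all codes
- Output format: Excel 2007 (*.xlsx)
- Fields: Year identifier [Sgnyea]; City name [Ctnm]; City code [Ctnm_id]; City category [Cttyp]; Province identifier [Prvcnm_id]; Province name [Prvcnm]; Year-end total population [Eect01]; Year-end non-agricultural population [Eect02]; Natural population growth rate (‰) [Eect03]; Population density [Eect04]; Employed persons [Eect05]; Urban self-employed workers [Eect06]; Share of primary-industry employment (%) [Eect08]; Share of secondary-industry employment (%) [Eect09]; Share of tertiary-industry employment (%) [Eect10]; Average annual number of staff and workers [Eect11]; Total wages of staff and workers [Eect12]; Average wage of staff and workers [Eect13]
- -
- Table: Province-level average sales price of commercial housing by use (分省份按用途分商品房屋平均销售价格)
- Period: 1997 to 2021
- Codes: all codes
- Output format: Excel 2007 (*.xlsx)
- Fields: Year identifier [Sgnyea]; Province code [Prvcnm_id]; Province name [Prvcnm]; Average sales price of housing [Ifa0701]; Average sales price of residential housing [Ifa0702]; Average sales price of villas and high-end apartments [Ifa0703]; Average sales price of affordable housing [Ifa0704]; Average sales price of office buildings [Ifa0705]; Average sales price of commercial business premises [Ifa0706]; Average sales price of other housing [Ifa0707]
Longitude and latitude data come from:
Data cleaning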
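* Helper program A: turn the first data row into variable labels, then drop the two leading non-data rows
cap pr drop A
pr def A
cd D:\Download
foreach var of var * {
label variable `var' "`=`var'[1]'"
replace `var' = "" if _n == 1
}
drop in 1/2
end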
******** City-level population and wages
import excel "D:\Download\分城市人口、就业与工资094426183\CRE_Eplwagct.xlsx", sheet("sheet1") firstrow clear
A
ren Sg year
destring y *id E*, force replace
g lnL = ln(Eect01*10000)
ren Eect13 w
drop E*
save 1, replace
******** Province-level housing prices
import excel "D:\Download\分省份按用途分商品房屋平均销售价格095828171\CRE_Ifa07.xlsx", sheet("sheet1") firstrow clear
A
ren Sg year
destring y *id I*, force replace
ren Ifa0702 hp
drop I*
save 2, replace
******** Longitude and latitude, averaged to one record per area code
import excel "D:\Download\分地区经纬度表(年)210144623\EG_longlatitudeY.xlsx", sheet("sheet1") firstrow clear
A
destring l*, force replace
drop if lo == .
drop S
bys AreaC: egen lo = mean(lo)
bys AreaC: egen la = mean(la)
keep A* lo la
duplicates drop
destring AreaC, force replace
ren AreaC Ctnm_id
duplicates drop Ctnm_id, force
save 3, replace
******** Merge the three sources and fill gaps
use 1, clear
merge m:1 y Prvcnm_id using 2, nogen keep(1 3)
merge m:1 Ctnm_id using 3, nogen keep(1 3)
drop if hp == .
drop if lo == .
replace w = 338371 if w == 3338371
cap drop *_A
* Fill gaps in w and lnL by within-city linear interpolation/extrapolation over years,
* then discard filled values that fall outside the originally observed range
foreach Q of var w lnL{
su `Q'
loc max = `r(max)'
loc min = `r(min)'
bys Ctnm_id: ipolate `Q' y, g(`Q'_A) epolate
replace `Q' = `Q'_A if `Q' == .
replace `Q' = . if `Q' < `min'
replace `Q' = . if `Q' > `max'
drop `Q'_A
}
* Remaining gaps: use the province-year mean
foreach Q of var w lnL{
bys y Prvcnm_id: egen `Q'm = mean(`Q')
replace `Q' = `Q'm if `Q' == .
drop `Q'm
}
drop if lnL == .
drop if w == .
order y *id P* C* A
su
save 4, replace
This gives:
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
year | 7,262 2009.49 6.330745 1999 2020
Ctnm_id | 7,262 399615.8 148210.7 110000 654300
Prvcnm_id | 7,262 398625.7 147953.3 110000 650000
Prvcnm | 0
Ctnm | 0
-------------+---------------------------------------------------------
Cttyp | 0
AreaName | 0
w | 7,262 35141.97 29794.98 5.4566 969641
lnL | 7,262 14.87672 .9296275 5.298317 17.34665
hp | 7,262 4128.585 3062.344 729 42684
-------------+---------------------------------------------------------
lo | 7,262 111.9591 9.437759 75.99381 131.2723
la | 7,262 32.98874 6.836195 16.83274 51.42405
Next, form the Cartesian product of the dataset with itself; reference:
use 4, clear
ren lo λM
ren la ωM
keep Ctnm_id y *M
save 5, replace
********
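* For each year, pair every city (suffix M) with every city (suffix C) through a shared row index
forv y = 1999/2020{
use 5, clear
keep if y == `y'
save 5_y, replace
**
* Left side: each city repeated n1 times, sorted so that its copies are adjacent
use 5_y, clear
su y
loc n1 = `r(N)'
expand `n1'
sort C
cap drop flag
g flag = _n
save 5_扩, replace
**
* Right side: the whole city list stacked n1 times
use 5_y, clear
loc n1_ = `n1'-1
forv i = 1/`n1_'{
append using 5_y
}
cap drop flag
g flag = _n
ren Ctnm_id Ctnm_id2
ren λM λC
ren ωM ωC
save 5_扩2, replace
**
use 5_扩, clear
merge 1:1 flag using 5_扩2, nogen keep(1 3)
save `y', replace
}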
********
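use 1999, clear
forv y = 2000/2020{
append using `y'
}
drop if Ctnm_id == Ctnm_id2
* RAD is the Earth's radius in km (40075.04 / (2*_pi))
sca RAD = (40075.04/360)*(180/_pi)
* NOTE: sin(), cos() and acos() work in radians, while lo/la are stored in degrees;
* multiply the coordinates by _pi/180 first to obtain true great-circle distances
g α = acos(sin(ωC) * sin(ωM) + cos(ωC) * cos(ωM) * cos(λC-λM))
g DIST = RAD * (_pi/2 - atan(cos(α)/sqrt(1-(cos(α))^2)))
g Dn2 = D^(-2)
su
kdensity DIST
save 6, replace
This gives: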
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
year | 2,390,304 2009.481 6.316803 1999 2020
Ctnm_id | 2,390,304 399660.7 148223.7 110000 654300
λM | 2,390,304 111.9548 9.44242 75.99381 131.2723
ωM | 2,390,304 32.99006 6.836107 16.83274 51.42405
flag | 2,390,304 54529.76 31524.04 2 110889
-------------+---------------------------------------------------------
Ctnm_id2 | 2,390,304 399660.7 148223.7 110000 654300
λC | 2,390,304 111.9548 9.44242 75.99381 131.2723
ωC | 2,390,304 32.99006 6.836107 16.83274 51.42405
α | 2,390,304 1.565197 .7290222 .0031313 3.136759
DIST | 2,390,304 9983.048 4649.806 19.97214 20006.69
-------------+---------------------------------------------------------
Dn2 | 2,390,304 1.58e-07 .000011 2.50e-09 .002507
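
The pairing and distance steps can be sketched compactly in Python. This is an illustrative sketch, not the author's code: the function name `great_circle_km` and the three-city sample are chosen here, with coordinates copied from the listing near the end of this post. It uses the same spherical law of cosines and the same Earth-radius constant as the Stata code, but converts degrees to radians first. The conversion matters because Stata's trigonometric functions take radians, and the DIST summary above (mean about 9,983 km, maximum 20,006.69 km, roughly half of the 40,075.04 km circumference) suggests the coordinates entered in degrees.

```python
import math
from itertools import permutations

# Same constant as RAD in the Stata code: 40075.04 / (2*pi), about 6378.14 km
EARTH_RADIUS_KM = 40075.04 / (2 * math.pi)


def great_circle_km(lon1, lat1, lon2, lat2):
    """Spherical law of cosines; all inputs in decimal degrees."""
    lam1, phi1, lam2, phi2 = map(math.radians, (lon1, lat1, lon2, lat2))
    cos_alpha = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(lam1 - lam2))
    cos_alpha = max(-1.0, min(1.0, cos_alpha))  # guard against rounding error
    return EARTH_RADIUS_KM * math.acos(cos_alpha)


# city code -> (longitude, latitude): Beijing, Tianjin, Shijiazhuang
cities = {
    110000: (116.7248, 39.9049),
    120000: (117.2013, 39.08507),
    130100: (114.5148, 38.04232),
}

# All ordered pairs (i, j) with i != j, the analogue of the self cross join
pairs = [(i, j, great_circle_km(*cities[i], *cities[j]))
         for i, j in permutations(cities, 2)]

for i, j, d in pairs:
    print(i, j, round(d, 1), d ** -2)
```

Note that `_pi/2 - atan(c/sqrt(1-c^2))` equals `acos(c)`, so the Stata DIST expression reduces to RAD × α; the sketch calls `acos` directly.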

Merging further:
use 6, clear
keep y C* Dn2
merge m:1 y Ctnm_id using 4, nogen keep(1 3)
save 7, replace
********
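use 4, clear
* Suffix every variable with 2 so that dataset 4 can be attached as the destination city
foreach Q of var *{
ren `Q' `Q'2
}
ren y year
save 4_2, replace
********
use 7, clear
merge m:1 y Ctnm_id2 using 4_2, nogen keep(1 3)
* Gaps are shifted by +2 before taking logs; ln() of a non-positive value is missing,
* and negative products are set to missing
g Lflowij = lnL * ln(w2 - w + 2) * ln(hp2 - hp + 2) * Dn2
replace Lf = . if Lf < 0
order y *id* P* C* A*
su
save 7, replace
This gives: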
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
year | 2,390,304 2009.481 6.316803 1999 2020
Ctnm_id | 2,390,304 399660.7 148223.7 110000 654300
Ctnm_id2 | 2,390,304 399660.7 148223.7 110000 654300
Prvcnm_id | 2,390,304 398670.3 147965.9 110000 650000
Prvcnm_id2 | 2,390,304 398670.3 147965.9 110000 650000
-------------+---------------------------------------------------------
Prvcnm | 0
Prvcnm2 | 0
Ctnm | 0
Cttyp | 0
Ctnm2 | 0
-------------+---------------------------------------------------------
Cttyp2 | 0
AreaName | 0
AreaName2 | 0
Dn2 | 2,390,304 1.58e-07 .000011 2.50e-09 .002507
w | 2,390,304 35070.14 29689.84 5.4566 969641
-------------+---------------------------------------------------------
lnL | 2,390,304 14.87645 .929755 5.298317 17.34665
hp | 2,390,304 4121.804 3051.37 729 42684
lo | 2,390,304 111.9548 9.44242 75.99381 131.2723
la | 2,390,304 32.99006 6.836107 16.83274 51.42405
w2 | 2,390,304 35070.14 29689.84 5.4566 969641
-------------+---------------------------------------------------------
lnL2 | 2,390,304 14.87645 .929755 5.298317 17.34665
hp2 | 2,390,304 4121.804 3051.37 729 42684
lo2 | 2,390,304 111.9548 9.44242 75.99381 131.2723
la2 | 2,390,304 32.99006 6.836107 16.83274 51.42405
Lflowij | 698,836 .0001604 .0148496 0 3.392351
Dataset 7 can now be matched to city-pair datasets. Aggregating further:
use 7, clear
keep y C* P* Lflowij
drop *2
bys Ctnm_id y: egen Lflowi = sum(Lflowij)
drop Lflowij
duplicates drop
su
tabstat Lf, by(y) s(N mean sd min p25 p50 p75 max) c(s)
ren Lflowi Lflowi_out
save 地区劳动力流动(流出), replace
********
use 7, clear
keep y C* P* Lflowij
drop Ctnm_id Prvcnm_id Prvcnm Ctnm Cttyp
bys Ctnm_id2 y: egen Lflowi = sum(Lflowij)
drop Lflowij
duplicates drop
su
tabstat Lf, by(y) s(N mean sd min p25 p50 p75 max) c(s)
ren Ctnm_id2 Ctnm_id
ren Lflowi Lflowi_in
save 地区劳动力流动(流入), replace
********
use 4, clear
merge 1:1 y Ctnm_id using 地区劳动力流动(流出), nogen keep(1 3) keepus(Lf*)
merge 1:1 y Ctnm_id using 地区劳动力流动(流入), nogen keep(1 3) keepus(Lf*)
g Lflowi_netout = Lflowi_out - Lflowi_in
su
list in 1/5
tabstat *netout, by(y) s(N mean sd min p25 p50 p75 max) c(s)
kdensity Lflowi_netout
#delimit ;
graph box Lflowi_netout,
asyvars over(year) legend(row(1) title("Year"))
scale(0.6) title("Regional net labor outflow")
blabel(bar) ytitle("")
subtitle("Figure by: Mr Figurant")
note("Data source: CSMAR database")
;
#delimit cr
save 地区劳动力流动_汇总, replace
The results are:
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
year | 7,262 2009.49 6.330745 1999 2020
Ctnm_id | 7,262 399615.8 148210.7 110000 654300
Prvcnm_id | 7,262 398625.7 147953.3 110000 650000
Prvcnm | 0
Ctnm | 0
-------------+---------------------------------------------------------
Cttyp | 0
AreaName | 0
w | 7,262 35141.97 29794.98 5.4566 969641
lnL | 7,262 14.87672 .9296275 5.298317 17.34665
hp | 7,262 4128.585 3062.344 729 42684
-------------+---------------------------------------------------------
lo | 7,262 111.9591 9.437759 75.99381 131.2723
la | 7,262 32.98874 6.836195 16.83274 51.42405
Lflowi_out | 7,262 .0154344 .1460925 0 3.407678
Lflowi_in | 7,262 .0154344 .1482686 0 3.434405
Lflowi_net~t | 7,262 -9.71e-12 .2088077 -3.43418 3.404605
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| year Ctnm_id Prvcnm~d Prvcnm Ctnm Cttyp AreaName w lnL hp lo la Lfl~_out Lflowi~n Lflo~tout |
|-----------------------------------------------------------------------------------------------------------------------------------------------------|
1. | 1999 110000 110000 北京市 北京市 直辖市 北京市 14054 16.21322 4787 116.7248 39.9049 0 .0124617 -.0124617 |
2. | 1999 120000 120000 天津市 天津市 直辖市 天津市 11123 16.02397 2157 117.2013 39.08507 .0000738 .0233874 -.0233135 |
3. | 1999 130100 130000 河北省 石家庄市 地级市 石家庄市 7938 15.98456 1309 114.5148 38.04232 .0010292 .0016666 -.0006374 |
4. | 1999 130200 130000 河北省 唐山市 地级市 唐山市 7288 15.75281 1309 118.1802 39.63045 .0029117 .0047577 -.001846 |
5. | 1999 130300 130000 河北省 秦皇岛市 地级市 秦皇岛市 8314 14.78629 1309 119.5178 39.88932 .0008989 .0033322 -.0024333 |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
Summary for variables: Lflowi_netout
Group variable: year (年度标识)
year | N Mean SD Min p25 p50 p75 Max
---------+--------------------------------------------------------------------------------
1999 | 329 1.74e-11 .0277408 -.1611705 -.0042862 .0007522 .0042626 .1888814
2000 | 329 1.18e-10 .0282896 -.1665387 -.0043717 .0008296 .0041744 .1989618
2001 | 330 2.25e-11 .0240936 -.1472842 -.004249 .0009071 .0042067 .2016368
2002 | 331 -9.09e-11 .1324039 -1.681809 -.0038591 .001046 .0043945 1.654296
2003 | 330 -2.42e-10 .1584072 -2.015784 -.0039192 .0012517 .0048261 1.986699
2004 | 331 -3.45e-10 .1691377 -2.158661 -.0029372 .0011606 .0044124 2.126516
2005 | 331 -1.64e-10 .195416 -2.509623 -.0029423 .0008945 .0043134 2.456881
2006 | 332 -4.40e-10 .2019867 -2.598199 -.0027045 .0005442 .0040537 2.545371
2007 | 332 -6.54e-10 .2119776 -2.72631 -.0033691 .0006421 .0045685 2.674036
2008 | 332 -1.09e-10 .2135474 -2.736552 -.0036021 .0002347 .0057832 2.689161
2009 | 327 5.68e-10 .2223687 -2.837992 -.0040826 .0003055 .0055301 2.781558
2010 | 322 3.21e-10 .2273346 -2.880239 -.0037451 .0000816 .0052841 2.824059
2011 | 332 4.42e-10 .2260708 -2.894592 -.0045961 -.0000542 .0057061 2.846542
2012 | 333 -5.73e-10 .2323528 -2.981497 -.004235 .0001314 .0052828 2.93519
2013 | 333 8.93e-10 .2397354 -3.072487 -.0043471 .0005724 .005785 3.035283
2014 | 333 2.11e-10 .2452259 -3.146704 -.00491 .0012048 .0060716 3.101878
2015 | 333 -1.13e-10 .2427422 -3.126897 -.0028679 .0007746 .0044373 3.077749
2016 | 333 3.80e-10 .2419926 -3.11167 -.0032849 .0006465 .0051406 3.065475
2017 | 333 -5.95e-10 .2526476 -3.247251 -.003589 .0006167 .0052115 3.193555
2018 | 333 1.45e-10 .260378 -3.338798 -.0028529 .0010683 .0046817 3.288229
2019 | 330 7.28e-10 .2648793 -3.382291 -.0050669 .0016887 .0062694 3.344208
2020 | 313 -7.57e-10 .2769957 -3.43418 -.0045591 .0017511 .0068315 3.404605
---------+--------------------------------------------------------------------------------
Total | 7262 -9.71e-12 .2088077 -3.43418 -.0038551 .0007395 .0049769 3.404605
------------------------------------------------------------------------------------------
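
The aggregation from city pairs to city totals can be illustrated with a small Python sketch. The city names and flow values below are hypothetical. Skipping missing pair values mirrors how `egen ... sum()` treats missing values as zero. The sketch also shows why the mean net outflow in the tables above is zero up to rounding (for example -9.71e-12 overall): each pairwise flow is counted once as an outflow and once as an inflow.

```python
from collections import defaultdict

# Hypothetical pairwise flows: (origin, destination, flow); None = missing
pair_flows = [
    ("A", "B", 0.4), ("B", "A", None),
    ("A", "C", 0.1), ("C", "A", 0.3),
    ("B", "C", None), ("C", "B", 0.2),
]

outflow = defaultdict(float)
inflow = defaultdict(float)
for origin, dest, flow in pair_flows:
    if flow is None:  # missing pair values contribute nothing to the sums
        continue
    outflow[origin] += flow  # analogue of Lflowi_out
    inflow[dest] += flow     # analogue of Lflowi_in

cities = sorted({c for pair in pair_flows for c in pair[:2]})
net_out = {c: outflow[c] - inflow[c] for c in cities}  # analogue of Lflowi_netout

for c in cities:
    print(c, outflow[c], inflow[c], net_out[c])
print("sum of net outflows:", sum(net_out.values()))
```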


(End) |
|