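In day-to-day work, reordering a categorical variable by a specified sort criterion is a very common task. This post introduces myaxis, a community-contributed Stata command that makes this easy.
Install the community-contributed command myaxis:
ssc install myaxis, replace
Load the example data:
. sysuse auto, clear
(1978 Automobile Data)
Inspect the default ordering of the categorical variable rep78: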
. tab rep78

     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2        2.90        2.90
          2 |          8       11.59       14.49
          3 |         30       43.48       57.97
          4 |         18       26.09       84.06
          5 |         11       15.94      100.00
------------+-----------------------------------
      Total |         69      100.00

. tab rep78, sum(mpg)

     Repair |     Summary of Mileage (mpg)
Record 1978 |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |          21   4.2426407           2
          2 |      19.125   3.7583241           8
          3 |   19.433333   4.1413252          30
          4 |   21.666667   4.9348699          18
          5 |   27.363636   8.7323849          11
------------+------------------------------------
      Total |   21.289855   5.8664085          69
Scenario 1: to sort the categories of rep78 by frequency in descending order, run:
(Without the descending option, myaxis sorts in ascending order by default.)
. myaxis Lian1=rep78, sort(count) descending

. tab Lian1

     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          3 |         30       43.48       43.48
          4 |         18       26.09       69.57
          5 |         11       15.94       85.51
          2 |          8       11.59       97.10
          1 |          2        2.90      100.00
------------+-----------------------------------
      Total |         69      100.00

// The community-contributed command fre (ssc install fre) shows the same result:
. fre Lian1

Lian1 -- Repair Record 1978
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   1 3   |         30      40.54      43.48      43.48
        2 4   |         18      24.32      26.09      69.57
        3 5   |         11      14.86      15.94      85.51
        4 2   |          8      10.81      11.59      97.10
        5 1   |          2       2.70       2.90     100.00
        Total |         69      93.24     100.00
Missing .     |          5       6.76
Total         |         74     100.00
-----------------------------------------------------------
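To see the default sort direction for comparison, the same call can be run without descending. This is a minimal sketch; the variable name Lian1_asc is a hypothetical one chosen for illustration and does not appear in the original post.

```stata
* Minimal sketch: same sort criterion as Scenario 1, but without descending.
* Lian1_asc is a hypothetical variable name used only for this illustration.
sysuse auto, clear

* sort(count) orders the categories of rep78 by their frequency;
* with no descending option, the order is ascending.
myaxis Lian1_asc = rep78, sort(count)

* The least frequent category should now be listed first.
tabulate Lian1_asc
```

Given the frequencies shown above (2, 8, 30, 18 and 11), the categories should appear in the order 1, 2, 5, 4, 3.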
Scenario 2: to sort the categories of rep78 by the within-group mean of the continuous variable mpg, in descending order, run:
. myaxis Lian2=rep78, sort(mean mpg) descending

. tab Lian2, sum(mpg)

     Repair |     Summary of Mileage (mpg)
Record 1978 |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          5 |   27.363636   8.7323849          11
          4 |   21.666667   4.9348699          18
          1 |          21   4.2426407           2
          3 |   19.433333   4.1413252          30
          2 |      19.125   3.7583241           8
------------+------------------------------------
      Total |   21.289855   5.8664085          69

// tabstat shows the same result:
. tabstat mpg, stat(mean sd count) by(Lian2)

Summary for variables: mpg
     by categories of: Lian2 (Repair Record 1978)

  Lian2 |      mean        sd         N
--------+------------------------------
      5 |  27.36364  8.732385        11
      4 |  21.66667   4.93487        18
      1 |        21  4.242641         2
      3 |  19.43333  4.141325        30
      2 |    19.125  3.758324         8
--------+------------------------------
  Total |  21.28986  5.866408        69
---------------------------------------
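The reordered variable can also drive the category order in a graph. The sketch below is an illustration rather than part of the original post: it assumes myaxis is installed and uses Stata's built-in graph hbar to plot mean mpg over the reordered categories.

```stata
* Minimal sketch: use the reordered variable as a graph axis.
* Assumes myaxis is already installed (ssc install myaxis).
sysuse auto, clear

* Order the categories of rep78 by mean mpg, highest first.
myaxis Lian2 = rep78, sort(mean mpg) descending

* Bars follow the order of Lian2; its value labels carry the
* original rep78 categories, so the axis still shows 1 to 5.
graph hbar (mean) mpg, over(Lian2)
```

Because the ordering is stored in the variable itself, any later table or graph that uses Lian2 picks up the same order.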
Command notes:
Minimum version: Stata 8.2
Release date: March 19, 2021
Author: Nicholas J. Cox, Durham University
Contact email: [email protected]