Python | Pandas | Operating on Data

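For unary operations, NumPy universal functions (ufuncs) preserve the row index and column labels in the output.
For binary operations, pandas automatically aligns the indexes of the operands before computing the result.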

1. Unary operations: index preservation

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0, 100, (3, 4)), columns=['A', 'B', 'C', 'D'])
df

	A	B	C	D
0	25	73	54	82
1	74	23	96	47
2	32	23	87	39

np.sin(df * np.pi / 4)

	A	B	C	D
0	7.071068e-01	0.707107	-1.000000e+00	1.000000
1	1.000000e+00	-0.707107	-2.939152e-15	-0.707107
2	-9.797174e-16	-0.707107	-7.071068e-01	-0.707107
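
As a quick check of the claim above, here is a minimal sketch that applies a ufunc to a small DataFrame and compares the labels before and after. The data and the custom index ("x", "y") are made up for illustration; they are not from the original notes.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a non-default index, so label preservation is visible
df = pd.DataFrame(
    [[0, 1], [2, 3]],
    index=["x", "y"],
    columns=["A", "B"],
)

# Apply a NumPy ufunc directly to the DataFrame
result = np.exp(df)

# The result is still a DataFrame, with the same row index and column labels
print(type(result).__name__)
print(result.index.equals(df.index))
print(result.columns.equals(df.columns))
```

The same holds for a Series: the ufunc is applied element by element and the index is carried over unchanged.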

2. Binary operations: index alignment

# Series
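# The result index is the union of the two indexes; values are aligned by label
# before the operation, and a label present in only one Series yields NaN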
area = pd.Series({'Alaska':1723337, 'Texas':695662, 'California':423967}, name = 'area')
population = pd.Series({'Alaska':38332521, 'Texas':26448193, 'New York':19651127}, name = 'population')
population / area

Alaska        22.243195
California          NaN
New York            NaN
Texas         38.018740
dtype: float64
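
To make the union step visible, the sketch below rebuilds the two Series from the example above, computes the index union explicitly, and then drops the labels that exist on only one side. The variable names union and density are introduced here for illustration.

```python
import pandas as pd

area = pd.Series({'Alaska': 1723337, 'Texas': 695662, 'California': 423967}, name='area')
population = pd.Series({'Alaska': 38332521, 'Texas': 26448193, 'New York': 19651127}, name='population')

# The union of the two indexes contains every label from either Series
union = area.index.union(population.index)
print(list(union))

# Division aligns on labels first; unmatched labels become NaN
density = population / area
print(density.isna())

# Keep only the labels that were present in both Series
print(density.dropna())
```

Calling dropna() afterwards is one way to restrict the result to the intersection of the two indexes.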

# DataFrame
# Same steps, applied to both the row index and the columns: union the labels, align, then compute
A = pd.DataFrame(np.random.randint(0,10,(2,3)), columns=list('ABD'))
B = pd.DataFrame(np.random.randint(0,10,(3,3)), columns=list('BCA'))
print(A)
print("="*20)
print(B)
print("="*20)
print(A+B)

   A  B  D
0  3  8  3
1  5  7  5
====================
   B  C  A
0  0  6  0
1  2  6  3
2  5  6  7
====================
     A    B   C   D
0  3.0  8.0 NaN NaN
1  8.0  9.0 NaN NaN
2  NaN  NaN NaN NaN

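# Use the method form of the operator (A.add is the method equivalent of +) and pass fill_value:
# a cell missing on only one side is filled before the operation, so NaN remains only where both sides are missing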
A.add(B, fill_value=0)

	A	B	C	D
0	3.0	8.0	6.0	3.0
1	8.0	9.0	6.0	5.0
2	7.0	5.0	6.0	NaN
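
The following sketch reproduces the comparison with the values of A and B hard-coded from the printed output above (the original uses random data), and counts the NaN cells with and without fill_value. The names plain and filled are introduced here for illustration.

```python
import pandas as pd

# Same values as the printed A and B above, hard-coded so the run is reproducible
A = pd.DataFrame([[3, 8, 3], [5, 7, 5]], columns=list('ABD'))
B = pd.DataFrame([[0, 6, 0], [2, 6, 3], [5, 6, 7]], columns=list('BCA'))

plain = A + B                    # NaN wherever either operand lacks the cell
filled = A.add(B, fill_value=0)  # a cell missing on one side is treated as 0

# Count the missing cells in each result
print(int(plain.isna().sum().sum()))
print(int(filled.isna().sum().sum()))

# Row 2, column 'D' exists in neither A nor B, so it stays NaN
print(filled.loc[2, 'D'])
```

The other arithmetic methods, such as sub(), mul() and div(), accept fill_value in the same way.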

3. Operations between Series and DataFrame

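Operations between a DataFrame and a Series follow the same broadcasting rules as operations between a two-dimensional and a one-dimensional NumPy array.
By default the Series index is matched against the DataFrame's column labels and the operation is applied row by row. To operate column by column instead, use the method form of the operator (for example, sub) and pass axis=0.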
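
The two directions can be sketched as follows. The 3x3 DataFrame and the names by_row and by_col are made up for illustration; they are not from the original notes.

```python
import pandas as pd

# Hypothetical data chosen so the results are easy to read
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=list('ABC'))

# Default: the Series index (A, B, C) is matched against the column labels,
# so the first row is subtracted from every row
by_row = df - df.iloc[0]

# axis=0: the Series index (0, 1, 2) is matched against the row index,
# so column 'A' is subtracted from every column
by_col = df.sub(df['A'], axis=0)

print(by_row)
print(by_col)
```

As with the other binary operations, labels are aligned first, so a Series that lacks one of the DataFrame's labels would produce NaN in the corresponding row or column.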