Key Pandas Features



Ramanu 2020. 11. 30. 12:54

1. data.drop_duplicates() : removes duplicate rows

2. data.value_counts() : counts how many times each value appears

3. pd.DataFrame(columns=["x", "y"]) : creates an empty DataFrame with columns x and y

4. data.loc[idx] = values : adds a row with index idx; the values must match the frame's columns

5. data.sort_values(by="x") : sorts the rows by column x

6. pd.merge(data1, data2, how='left', on='x') : merges data1 and data2 on column x; how selects which frame is the base ('left' or 'right')

# If the key columns have different names: pd.merge(data1, data2, how='left', left_on='x', right_on='y')

7. pd.Timestamp('2020-11-30') + pd.DateOffset(seconds=10) : the time 10 seconds after 2020-11-30 00:00:00

# DateOffset also accepts minutes, hours, days, months, and so on

8. data['date'].astype('datetime64[ns]') : converts the date column to the datetime64[ns] type (it returns a new Series, so assign the result back to data['date'] to keep the change)
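The calls listed above can be tried together in one short script. This is a minimal sketch, not a definitive reference: the frames `left`, `right`, and `dates` and all of their values are made up for illustration and do not come from the original notes.

```python
import pandas as pd

# Items 3 and 4: start from an empty frame and add rows by assigning to .loc
data = pd.DataFrame(columns=["x", "y"])
data.loc[0] = [3, "a"]
data.loc[1] = [1, "b"]
data.loc[2] = [3, "a"]  # exact duplicate of row 0

# Items 1, 2 and 5: each call returns a new object and leaves data unchanged
deduped = data.drop_duplicates()            # keeps the first copy of each row
counts = data["x"].value_counts()           # occurrences of each value in x
sorted_data = deduped.sort_values(by="x")   # ascending order by column x

# Item 6: left merge on a shared key column (hypothetical sample frames)
left = pd.DataFrame({"x": [1, 2, 3], "a": ["p", "q", "r"]})
right = pd.DataFrame({"x": [1, 3, 4], "b": [10, 30, 40]})
merged = pd.merge(left, right, how="left", on="x")

# Same merge when the key columns have different names
right_y = right.rename(columns={"x": "y"})
merged_keys = pd.merge(left, right_y, how="left", left_on="x", right_on="y")

# Item 7: shift a timestamp by a calendar-aware offset
later = pd.Timestamp("2020-11-30") + pd.DateOffset(seconds=10)

# Item 8: astype returns a new Series, so assign it back to the column
dates = pd.DataFrame({"date": ["2020-11-30", "2020-12-01"]})
dates["date"] = dates["date"].astype("datetime64[ns]")

print(merged)
print(later)
```

With how="left", every row of `left` is kept, and rows without a match in `right` get NaN in the columns that come from `right`.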

Extra)

- pickle is useful for saving data that is not a simple one-dimensional structure, such as multi-dimensional or nested data.

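import pickle

# Save the object to a file (binary write mode)
with open('./data.pkl', 'wb') as f:
    pickle.dump(data, f)

# Load the object back from the file (binary read mode)
with open('./data.pkl', 'rb') as f:
    data = pickle.load(f)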

It works with dictionaries, lists, and other Python objects.
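As a small illustration of the point about nested data, the sketch below round-trips a dictionary that contains lists and a tuple. The `record` object is hypothetical, and the file is written to a temporary directory (an assumption made here so the example does not touch ./data.pkl).

```python
import os
import pickle
import tempfile

# Hypothetical nested data mixing a dictionary, nested lists, and a tuple
record = {"name": "sample", "matrix": [[1, 2], [3, 4]], "shape": (2, 2)}

# Use a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.pkl")

    # Save first (binary write mode) ...
    with open(path, "wb") as f:
        pickle.dump(record, f)

    # ... then load the same file back (binary read mode)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored object is an equal copy, with the nesting and types preserved
print(restored == record)
print(type(restored["shape"]))
```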
