Pandas Hands-On Practice
Let's load purchase history data from the Amazon site into Pandas and analyze it!
1. Loading and checking the data
import pandas as pd
df = pd.read_csv('ecommerse.csv')
2. Exploring the data
df.head()
| | Address | Lot | AM or PM | Browser Info | Company | Credit Card | CC Exp Date | CC Security Code | CC Provider | Email | Job | IP Address | Language | Purchase Price |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 16629 Pace Camp Apt. 448\nAlexisborough, NE 77... | 46 in | PM | Opera/9.56.(X11; Linux x86_64; sl-SI) Presto/2... | Martinez-Herman | 6011929061123406 | 02/20 | 900 | JCB 16 digit | pdunlap@yahoo.com | Scientist, product/process development | 149.146.147.205 | el | 98.14 |
1 | 9374 Jasmine Spurs Suite 508\nSouth John, TN 8... | 28 rn | PM | Opera/8.93.(Windows 98; Win 9x 4.90; en-US) Pr... | Fletcher, Richards and Whitaker | 3337758169645356 | 11/18 | 561 | Mastercard | anthony41@reed.com | Drilling engineer | 15.160.41.51 | fr | 70.73 |
2 | Unit 0065 Box 5052\nDPO AP 27450 | 94 vE | PM | Mozilla/5.0 (compatible; MSIE 9.0; Windows NT ... | Simpson, Williams and Pham | 675957666125 | 08/19 | 699 | JCB 16 digit | amymiller@morales-harrison.com | Customer service manager | 132.207.160.22 | de | 0.95 |
3 | 7780 Julia Fords\nNew Stacy, WA 45798 | 36 vm | PM | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0 ... | Williams, Marshall and Buchanan | 6011578504430710 | 02/24 | 384 | Discover | brent16@olson-robinson.info | Drilling engineer | 30.250.74.19 | es | 78.04 |
4 | 23012 Munoz Drive Suite 337\nNew Cynthia, TX 5... | 20 IE | AM | Opera/9.58.(X11; Linux x86_64; it-IT) Presto/2... | Brown, Watson and Andrews | 6011456623207998 | 10/25 | 678 | Diners Club / Carte Blanche | christopherwright@gmail.com | Fine artist | 24.140.33.94 | es | 77.82 |
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Address 10000 non-null object
1 Lot 10000 non-null object
2 AM or PM 10000 non-null object
3 Browser Info 10000 non-null object
4 Company 10000 non-null object
5 Credit Card 10000 non-null int64
6 CC Exp Date 10000 non-null object
7 CC Security Code 10000 non-null int64
8 CC Provider 10000 non-null object
9 Email 10000 non-null object
10 Job 10000 non-null object
11 IP Address 10000 non-null object
12 Language 10000 non-null object
13 Purchase Price 10000 non-null float64
dtypes: float64(1), int64(2), object(11)
memory usage: 1.1+ MB
# Preprocessing: card numbers are identifiers, not quantities, so drop the int dtype
df['Credit Card'] = df['Credit Card'].astype(object)
# Preprocessing: same for the security code
df['CC Security Code'] = df['CC Security Code'].astype(object)
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Address 10000 non-null object
1 Lot 10000 non-null object
2 AM or PM 10000 non-null object
3 Browser Info 10000 non-null object
4 Company 10000 non-null object
5 Credit Card 10000 non-null object
6 CC Exp Date 10000 non-null object
7 CC Security Code 10000 non-null object
8 CC Provider 10000 non-null object
9 Email 10000 non-null object
10 Job 10000 non-null object
11 IP Address 10000 non-null object
12 Language 10000 non-null object
13 Purchase Price 10000 non-null float64
dtypes: float64(1), object(13)
memory usage: 1.1+ MB
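One caveat with converting after loading: by the time `astype(object)` runs, pandas has already parsed the column as an integer, so any leading zeros are gone. A minimal sketch of an alternative, assuming you can decide the dtypes at load time, is to pass `dtype` to `pd.read_csv`. The CSV text below is a made-up two-row sample (the `042` code is invented for illustration), not the real file.

```python
import io

import pandas as pd

# Made-up two-row sample; the second security code has a leading zero.
csv_text = (
    "Credit Card,CC Security Code,Purchase Price\n"
    "6011929061123406,900,98.14\n"
    "675957666125,042,0.95\n"
)

# Default parsing: both card columns become integers.
as_numbers = pd.read_csv(io.StringIO(csv_text))

# Parsing the identifier columns as strings keeps them exactly as written.
as_text = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"Credit Card": str, "CC Security Code": str},
)

print(as_numbers["CC Security Code"].tolist())  # [900, 42]
print(as_text["CC Security Code"].tolist())     # ['900', '042']
```

Whether this matters for the actual dataset depends on whether any codes start with zero, which the output above does not tell us.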
Find the average purchase price.
df['Purchase Price'].mean()
50.34730200000025
How many people use English (en) as their website language?
# Normalize case before comparing so that 'EN' or 'En' would also match
len(df[df['Language'].str.lower()=='en'])
1098
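The same count can also be taken straight from the boolean mask, since `True` counts as 1 when summed. A small sketch on a made-up `Language` column (the mixed-case values are hypothetical, added only to show why lowercasing helps):

```python
import pandas as pd

# Hypothetical sample with inconsistent casing.
sample = pd.DataFrame({"Language": ["en", "EN", "fr", "es", "En"]})

# Boolean mask: True where the normalized language code is 'en'.
is_english = sample["Language"].str.lower() == "en"

# Summing the mask counts the True values without building a filtered frame.
english_count = int(is_english.sum())
print(english_count)  # 3
```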
What jobs do the customers have?
df['Job'].unique()
array(['Scientist, product/process development', 'Drilling engineer',
'Customer service manager', 'Fine artist', 'Fish farm manager',
'Dancer', 'Event organiser', 'Financial manager',
'Forensic scientist', 'Development worker, community',
'Diagnostic radiographer', 'Surveyor, quantity',
'Accountant, chartered public finance', 'Acupuncturist',
'Retail manager', 'Therapist, art', 'Designer, jewellery',
'Photographer', 'Designer, interior/spatial',
'Public relations officer', 'Presenter, broadcasting',
'Field seismologist', 'Musician',
'Training and development officer', "Barrister's clerk",
'Careers adviser', 'Scientist, research (life sciences)',
'Recycling officer', 'Fisheries officer', 'Sales executive',
'Civil Service fast streamer', 'Theatre stage manager',
'Therapist, music', 'Lecturer, further education',
'Animal technologist', 'Psychologist, occupational',
'Music therapist', 'Minerals surveyor',
'Tourist information centre manager', 'Tax inspector',
'Buyer, industrial', 'Purchasing manager', 'Heritage manager',
'Games developer', 'Facilities manager', 'Chemist, analytical',
'Horticulturist, amenity', 'Mechanical engineer',
'Clinical research associate', 'Technical author',
'Radio broadcast assistant', 'Engineer, broadcasting (operations)',
'Clothing/textile technologist', 'Holiday representative',
'Housing manager/officer', 'Teacher, special educational needs',
'Engineer, agricultural', 'Buyer, retail',
'Doctor, general practice', 'Hotel manager',
'Engineer, civil (consulting)', 'General practice doctor',
'Jewellery designer', 'Therapist, speech and language',
'Operational researcher', 'Graphic designer',
'Editorial assistant', 'Financial adviser', 'Paramedic',
'Secretary, company', 'Paediatric nurse',
'Data processing manager', 'Print production planner',
'Engineer, materials', 'Economist', 'Designer, furniture',
'Human resources officer', 'Administrator',
'Teacher, secondary school', 'Video editor',
'Chartered certified accountant',
'Teacher, English as a foreign language', 'Airline pilot',
'Engineer, land', 'Legal secretary', 'Administrator, arts',
'Dealer', 'Town planner', 'Travel agency manager',
'Engineer, water', 'Lecturer, higher education',
'Higher education careers adviser', 'Intelligence analyst',
'Technical sales engineer', 'Dietitian',
'International aid/development worker',
'Learning disability nurse', 'Geneticist, molecular',
'Optometrist', 'Building surveyor', 'Sports therapist',
'Geophysical data processor', 'Multimedia programmer',
'Research scientist (maths)', 'Social worker', 'Camera operator',
'Interpreter', 'Furniture designer', 'Commercial horticulturist',
'Surgeon', 'Psychotherapist, dance movement',
'Information systems manager', 'Music tutor',
'Merchandiser, retail', 'Insurance risk surveyor',
'Armed forces operational officer', 'Homeopath',
'Animal nutritionist', 'Tourism officer', 'Cabin crew',
'Psychiatric nurse', 'Magazine features editor',
'Chartered management accountant', 'Market researcher',
'Mudlogger', 'Textile designer', 'Restaurant manager, fast food',
'Medical laboratory scientific officer',
'Chartered public finance accountant', 'Financial trader',
'Scientist, marine', 'Environmental manager',
'Community education officer', 'Solicitor',
'Psychologist, clinical', 'Occupational therapist',
'Retail merchandiser', 'Mental health nurse', 'Ophthalmologist',
'Insurance underwriter', 'Engineer, aeronautical',
'Chartered accountant', 'Environmental health practitioner',
'Research scientist (life sciences)', 'Wellsite geologist',
'Product designer', 'Armed forces technical officer',
'Television/film/video producer', 'Water quality scientist',
'Furniture conservator/restorer', 'Radio producer',
'Programmer, multimedia', 'Scientist, biomedical',
'Psychologist, sport and exercise', 'Merchant navy officer',
'Designer, television/film set', 'Engineering geologist',
'Health and safety inspector', 'Naval architect', 'Chiropodist',
'Engineer, chemical', 'Scientist, research (medical)',
'Personal assistant', 'Retail banker',
'Advertising account planner', 'Art therapist',
'Forest/woodland manager', 'Research scientist (medical)',
'Scientific laboratory technician', 'Licensed conveyancer',
'Engineer, building services', 'Energy engineer', 'Teacher, music',
'Oncologist', 'Brewing technologist', 'Community arts worker',
'Therapist, drama', 'Water engineer', 'Software engineer',
'Geochemist', 'Designer, industrial/product',
'Nature conservation officer',
'Production designer, theatre/television/film',
'Commercial art gallery manager', 'Exhibition designer',
'Education officer, environmental',
'Runner, broadcasting/film/video', 'Pharmacist, hospital',
'Doctor, hospital', 'Teacher, adult education', 'Sub',
'Podiatrist', 'Air broker', 'Engineer, civil (contracting)',
'Lexicographer', 'Speech and language therapist',
'Catering manager', 'Colour technologist', 'Surveyor, building',
'Agricultural consultant', 'Horticultural therapist',
'Pension scheme manager', 'Theatre manager', 'Producer, radio',
'Scientist, forensic', 'Surveyor, insurance',
'Hospital pharmacist', 'Surveyor, rural practice',
'Industrial/product designer', 'Corporate treasurer',
'Structural engineer', 'Research officer, trade union',
'Pensions consultant', 'Secretary/administrator', 'Firefighter',
'Database administrator', 'Health promotion specialist',
'Operations geologist', 'Regulatory affairs officer',
'Production assistant, radio', 'Press sub',
'Education administrator', 'Biomedical scientist', 'Tour manager',
'Midwife', 'Television floor manager', 'Mining engineer',
'Interior and spatial designer', 'Engineer, energy',
'Publishing rights manager', 'Manufacturing systems engineer',
'Designer, exhibition/display',
'Control and instrumentation engineer', 'Ergonomist',
'Scientist, research (maths)', 'Psychologist, forensic',
'Clinical scientist, histocompatibility and immunogenetics',
'Accommodation manager', 'Administrator, sports',
'Sports development officer', 'Educational psychologist',
'Social research officer, government', 'Quarry manager',
'Medical sales representative', 'Consulting civil engineer',
'Trade union research officer', 'Financial planner',
'Futures trader', 'Medical illustrator', 'Engineer, structural',
'Transport planner', 'Special educational needs teacher',
'Chiropractor', 'Therapist, sports',
'Scientist, research (physical sciences)', 'Site engineer',
'Psychologist, counselling', 'Bonds trader', 'Banker',
'Proofreader', 'Social researcher', 'Analytical chemist',
'Dramatherapist', 'Editor, commissioning', 'Sports administrator',
'Sport and exercise psychologist', 'Garment/textile technologist',
'Engineer, maintenance (IT)', 'Surveyor, building control',
'Media planner', 'Statistician', 'Food technologist',
'Chartered loss adjuster', 'Petroleum engineer',
'Engineer, maintenance', 'Television camera operator',
'Geoscientist', 'Printmaker', 'Artist', 'Hospital doctor',
'Systems analyst', 'Radiographer, therapeutic',
'Freight forwarder', 'Call centre manager', 'Theatre director',
'Occupational psychologist', 'Communications engineer',
'Land/geomatics surveyor', 'Advertising account executive',
'Civil engineer, consulting', 'Medical secretary',
'IT technical support officer', 'Advice worker',
'Visual merchandiser', 'Nurse, mental health',
'Further education lecturer', 'Investment banker, corporate',
'Bookseller', 'Engineer, production', 'Health visitor',
'Archivist', 'Recruitment consultant',
'Conservator, museum/gallery', 'Translator', 'Youth worker',
'Operational investment banker',
'Radiation protection practitioner',
'Research officer, political party', 'Make',
'Horticulturist, commercial', 'Surveyor, hydrographic',
'Engineer, communications', 'Surveyor, minerals', 'Hydrologist',
'Race relations officer', 'Quantity surveyor',
'Geologist, wellsite', 'Museum/gallery conservator',
'Community pharmacist', 'Adult guidance worker',
'Building control surveyor', 'Museum/gallery exhibitions officer',
'Ambulance person', 'Probation officer', 'Orthoptist',
'Charity officer', 'Risk analyst', 'Best boy',
'Arts administrator',
'Lighting technician, broadcasting/film/video',
'Engineer, drilling', 'Engineer, control and instrumentation',
'Medical physicist', 'Dispensing optician',
'Management consultant', 'Commercial/residential surveyor',
'English as a foreign language teacher', 'Engineer, site',
'Automotive engineer', 'Air traffic controller', 'Lawyer',
'Contracting civil engineer', 'Psychotherapist, child',
'Maintenance engineer', 'Science writer',
'Administrator, education', 'Administrator, Civil Service',
'Estate manager/land agent', 'Company secretary',
'Therapist, horticultural', 'Broadcast engineer',
'Sound technician, broadcasting/film/video',
'Private music teacher', 'Engineer, manufacturing',
'Emergency planning/management officer', 'Risk manager',
'IT consultant', 'Geographical information systems officer',
'Equities trader', 'Immigration officer',
'Therapeutic radiographer', 'Curator', 'Child psychotherapist',
'Advertising art director', 'Forensic psychologist',
'Hydrographic surveyor', 'Immunologist', 'Sports coach',
'Haematologist', 'Waste management officer',
'Conservator, furniture', 'Research officer, government',
'Agricultural engineer', 'IT sales professional',
'Armed forces logistics/support/administrative officer',
'Field trials officer', 'Psychotherapist', 'Air cabin crew',
'Psychiatrist', 'Illustrator', 'Marine scientist',
'Multimedia specialist', 'Cytogeneticist',
'Occupational hygienist', 'Charity fundraiser',
'Planning and development surveyor', 'Counselling psychologist',
'Physicist, medical', 'TEFL teacher',
'Research scientist (physical sciences)', 'Fashion designer',
'Engineer, petroleum', 'Tree surgeon',
'Corporate investment banker', 'Public house manager',
'Designer, textile', 'Public librarian',
'Arts development officer', 'Actor', 'Hydrogeologist',
'Production assistant, television', 'Embryologist, clinical',
'Press photographer', 'Marketing executive',
'Television production assistant', 'Early years teacher',
'Landscape architect', 'Librarian, academic', 'Web designer',
'Health service manager', 'Clinical psychologist',
'Exercise physiologist', 'Electrical engineer',
'Psychologist, prison and probation services',
'Journalist, magazine', 'Dance movement psychotherapist',
'Animator', 'Legal executive', 'Plant breeder/geneticist',
'Manufacturing engineer', 'Microbiologist', 'Biochemist, clinical',
'Broadcast presenter', 'Ranger/warden',
'Programme researcher, broadcasting/film/video', 'Dentist',
'Aid worker', 'Publishing copy', 'Environmental education officer',
'Librarian, public', 'Phytotherapist', 'Nurse, adult',
'Programmer, systems', 'Scientist, physiological',
'Secondary school teacher', 'Media buyer', 'Trade mark attorney',
'Editor, film/video', 'Careers information officer',
'Retail buyer', 'Ship broker', 'Lobbyist', 'Location manager',
'Surveyor, mining', 'Systems developer', 'Optician, dispensing',
'Chemical engineer', 'Oceanographer', 'Patent attorney',
'Passenger transport manager', 'Glass blower/designer',
'Psychologist, educational', 'Education officer, museum',
'Patent examiner', 'Logistics and distribution manager',
'Trading standards officer', 'Seismic interpreter',
'Local government officer', 'Clinical cytogeneticist',
'Biomedical engineer', 'Diplomatic Services operational officer',
'Land', 'Surveyor, planning and development', 'Farm manager',
'Claims inspector/assessor', 'Warden/ranger',
'Civil engineer, contracting', 'Clinical molecular geneticist',
'Surveyor, land/geomatics', 'Therapist, nutritional',
'Historic buildings inspector/conservation officer',
'English as a second language teacher', 'Materials engineer',
'Insurance broker', 'Ecologist', 'Production manager',
'Veterinary surgeon', 'Pilot, airline', "Nurse, children's",
'Astronomer', 'Architect', 'Conservation officer, nature',
'Personnel officer', 'Network engineer', 'Investment analyst',
'Writer', 'Designer, fashion/clothing', 'Theme park manager',
'Engineer, electrical', 'Art gallery manager', 'Office manager',
'Pharmacologist', 'Meteorologist', 'Estate agent',
"Politician's assistant", 'Environmental consultant', 'Osteopath',
'Amenity horticulturist', 'Physiological scientist',
'Public relations account executive',
'Senior tax professional/tax inspector', 'Arboriculturist',
'Radiographer, diagnostic', 'Engineer, automotive',
'Editor, magazine features', 'Sales professional, IT',
'Commissioning editor', 'Ceramics designer',
'Scientist, clinical (histocompatibility and immunogenetics)',
'Geologist, engineering', 'Designer, multimedia',
'Engineer, manufacturing systems', 'Clinical embryologist',
'Medical technical officer', 'Teacher, primary school',
'Set designer', 'Producer, television/film/video',
'Education officer, community',
'Conservation officer, historic buildings',
'Insurance account manager', 'Financial controller',
'Film/video editor', 'Museum/gallery curator',
'Accounting technician', 'Stage manager', 'Physiotherapist',
'Sales promotion account executive', 'Horticultural consultant',
'Journalist, broadcasting', 'Health and safety adviser',
'Building services engineer',
'Product/process development scientist', 'Metallurgist',
'Audiological scientist', 'Designer, blown glass/stained glass',
'Tax adviser', 'Industrial buyer',
'Government social research officer',
'Outdoor activities/education manager', 'Solicitor, Scotland',
'Conference centre manager', 'Health physicist',
'Pharmacist, community', 'Applications developer',
'Public affairs consultant', 'Prison officer',
'Production engineer', 'Actuary', 'Museum education officer',
'Fitness centre manager', 'Records manager',
'Journalist, newspaper', 'Computer games developer',
'Toxicologist', 'Counsellor', 'Therapist, occupational',
'Accountant, chartered certified', 'Insurance claims handler',
'Programmer, applications', 'Accountant, chartered management',
'Broadcast journalist', 'Energy manager', 'Barrister',
'Scientist, audiological', 'Designer, graphic', 'Cartographer',
'Community development worker', 'Engineer, mining',
'Special effects artist', 'Gaffer', 'Civil Service administrator',
'Armed forces training and education officer', 'Quality manager',
'Engineer, electronics', 'Development worker, international aid',
'Restaurant manager', 'Architectural technologist',
'Academic librarian', 'Fast food restaurant manager',
'Rural practice surveyor', 'Designer, ceramics/pottery',
'Police officer', 'Soil scientist', 'Technical brewer',
'Administrator, local government', 'Copywriter, advertising',
'Volunteer coordinator', 'Magazine journalist',
'Information officer', 'Copy', 'Advertising copywriter',
'Engineer, biomedical', 'Nurse, learning disability',
'Aeronautical engineer', 'Archaeologist', 'IT trainer',
'Clinical biochemist',
'Administrator, charities/voluntary organisations', 'Pathologist',
'Geophysicist/field seismologist',
'Investment banker, operational', 'Teacher, early years/pre',
'Electronics engineer', 'Exhibitions officer, museum/gallery',
'Nutritional therapist', 'Teaching laboratory technician',
'Herbalist', 'Leisure centre manager', 'Higher education lecturer',
'Newspaper journalist', 'Accountant, chartered',
'Loss adjuster, chartered', 'Surveyor, commercial/residential',
'Warehouse manager', 'Scientist, water quality',
'Primary school teacher',
'Chartered legal executive (England and Wales)',
'Telecommunications researcher', 'Equality and diversity officer',
'Learning mentor', 'Financial risk analyst',
'Engineer, technical sales', 'Adult nurse'], dtype=object)
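A list this long is hard to read at a glance. If you only need to know how many distinct jobs there are, `nunique()` returns the count directly. A minimal sketch on a made-up `Job` column:

```python
import pandas as pd

# Hypothetical sample; 'Lawyer' appears twice.
jobs = pd.Series(["Lawyer", "Dietitian", "Lawyer", "Fine artist"], name="Job")

# unique() lists distinct values in order of first appearance.
distinct_jobs = jobs.unique().tolist()

# nunique() returns only how many distinct values there are.
job_count = jobs.nunique()

print(distinct_jobs)  # ['Lawyer', 'Dietitian', 'Fine artist']
print(job_count)      # 3
```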
Which jobs account for the highest total spending?
df_Job = df.groupby('Job')
df_Job['Purchase Price'].sum().sort_values(ascending=False)[:10]
Job
Dietitian 1605.30
Lawyer 1603.85
Purchasing manager 1577.97
Therapist, art 1526.31
Clinical cytogeneticist 1495.92
Research officer, political party 1488.79
Designer, jewellery 1482.20
Interior and spatial designer 1466.20
Network engineer 1421.73
Social researcher 1416.34
Name: Purchase Price, dtype: float64
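Sorting and slicing works fine here. As an alternative sketch, `nlargest` expresses "top N" in one call. The frame below is made-up data with round numbers so the totals are easy to verify by hand.

```python
import pandas as pd

# Hypothetical purchases: Lawyer spends 10 + 20, Dietitian 25, Fine artist 5.
sample = pd.DataFrame(
    {
        "Job": ["Lawyer", "Lawyer", "Dietitian", "Fine artist"],
        "Purchase Price": [10.0, 20.0, 25.0, 5.0],
    }
)

# Total spend per job, then keep the two largest totals.
top_spenders = sample.groupby("Job")["Purchase Price"].sum().nlargest(2)
print(top_spenders.index.tolist())  # ['Lawyer', 'Dietitian']
print(top_spenders.tolist())        # [30.0, 25.0]
```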
Which job group made the most purchases (by count)?
df['Job'].value_counts()[:10]
Interior and spatial designer 31
Lawyer 30
Social researcher 28
Purchasing manager 27
Designer, jewellery 27
Research officer, political party 27
Dietitian 26
Social worker 26
Charity fundraiser 26
Special educational needs teacher 26
Name: Job, dtype: int64
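Several jobs show up in both top-10 lists, which hints that total spend here may be driven mostly by how often a job appears rather than by how much each purchase costs. One way to check is to put the sum, count, and mean side by side. This is a sketch on made-up data; the column names `total`, `purchases`, and `average` are my own labels.

```python
import pandas as pd

# Hypothetical purchases with round numbers.
sample = pd.DataFrame(
    {
        "Job": ["Lawyer", "Lawyer", "Dietitian", "Fine artist"],
        "Purchase Price": [10.0, 20.0, 25.0, 5.0],
    }
)

# Named aggregation: one output column per statistic.
summary = (
    sample.groupby("Job")["Purchase Price"]
    .agg(total="sum", purchases="count", average="mean")
    .sort_values("total", ascending=False)
)
print(summary)
```

In this toy data Lawyer has the highest total only because it has two purchases; its average (15.0) is lower than Dietitian's (25.0).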
Find the purchase history of the user with the email address bondellen@williams-garza.com.
# Normalizing case with .str.lower() makes the match more robust
df[df['Email'].str.lower() == 'bondellen@williams-garza.com']
| | Address | Lot | AM or PM | Browser Info | Company | Credit Card | CC Exp Date | CC Security Code | CC Provider | Email | Job | IP Address | Language | Purchase Price |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1234 | 2470 Maria Manors Suite 185\nJoneshaven, MN 87251 | 82 UX | PM | Mozilla/5.0 (X11; Linux x86_64; rv:1.9.5.20) G... | Cole, King and Bowers | 4926535242672853 | 09/21 | 188 | American Express | bondellen@williams-garza.com | Planning and development surveyor | 159.182.3.50 | ru | 77.88 |
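How many users paid with MasterCard and spent at least 50 dollars?
# Without case normalization: the data spells it 'Mastercard', so nothing matches
len(df[(df['CC Provider'] == 'MasterCard') & (df['Purchase Price'] >= 50)])
0
# With normalization: strip spaces inside each string, then lowercase
len(df[(df['CC Provider'].str.replace(' ', '').str.lower() == 'mastercard') & (df['Purchase Price'] >= 50)])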
405
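`Series.replace` and `Series.str.replace` are easy to mix up. The first swaps whole cell values that equal the target, while the second edits substrings inside each string. A short sketch of the difference, using a made-up provider list (the `'Master Card'` spelling is hypothetical and is not known to occur in the dataset):

```python
import pandas as pd

# Hypothetical provider names, one with a space in the middle.
providers = pd.Series(["Mastercard", "Master Card", "JCB 16 digit"])

# Series.replace: only cells that are exactly ' ' would change, so nothing does.
whole_value = providers.replace(" ", "")

# Series.str.replace: removes every space inside each string.
substring = providers.str.replace(" ", "", regex=False)

print(whole_value.tolist())  # ['Mastercard', 'Master Card', 'JCB 16 digit']
print(substring.tolist())    # ['Mastercard', 'MasterCard', 'JCB16digit']

# After stripping spaces and lowercasing, both spellings match.
match_count = int((substring.str.lower() == "mastercard").sum())
print(match_count)  # 2
```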
How many users are on Windows (based on Browser Info)?
len(df[df['Browser Info'].str.lower().str.contains('windows')])
4994
What share of all users run Windows, Linux, and Mac operating systems?
len(df[df['Browser Info'].str.lower().str.contains('windows')]) / len(df['Browser Info']) * 100
49.94
len(df[df['Browser Info'].str.lower().str.contains('linux')]) / len(df['Browser Info']) * 100
23.22
len(df[df['Browser Info'].str.lower().str.contains('mac')]) / len(df['Browser Info']) * 100
26.840000000000003
# A more programmatic way to write it
total = len(df)
os_names = ['windows', 'linux', 'mac']
total_user = 0
for name in os_names:
    # Count rows whose browser string mentions this OS (case-insensitive)
    user = len(df[df['Browser Info'].str.lower().str.contains(name)])
    total_user += user
    rate = user / total * 100
    print(name, ':', rate)
print('total_user : %d' % total_user)
windows : 49.94
linux : 23.22
mac : 26.840000000000003
total_user : 10000
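Because the mean of a boolean mask is the fraction of `True` values, the percentage can also be computed without dividing by the row count yourself. A sketch on four made-up, shortened user-agent strings (not taken from the dataset); `os_share` is a name I chose for the result:

```python
import pandas as pd

# Hypothetical, shortened user-agent strings: 2 Windows, 1 Linux, 1 Mac.
browsers = pd.Series(
    [
        "Opera/9.56.(X11; Linux x86_64; sl-SI)",
        "Opera/8.93.(Windows 98; Win 9x 4.90; en-US)",
        "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 10.0)",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0)",
    ]
)

os_share = {}
for name in ["windows", "linux", "mac"]:
    # case=False makes the substring match case-insensitive.
    mask = browsers.str.contains(name, case=False)
    # Mean of a boolean mask is the fraction of True values.
    os_share[name] = mask.mean() * 100

print(os_share)
```

Note that substring matching is a rough heuristic: a string mentioning two systems would be counted twice, so the shares adding up to exactly 100 is a property of this data rather than a guarantee.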