
[Data Analysis] Analyzing Amazon E-commerce Data

by CodingKwon 2021. 6. 30.

Hands-on Pandas practice

Let's load purchase-history data from the Amazon site into Pandas and analyze it!

1. Loading and checking the data

import pandas as pd
df = pd.read_csv('ecommerse.csv')

2. Exploring the data

df.head()
  | Address | Lot | AM or PM | Browser Info | Company | Credit Card | CC Exp Date | CC Security Code | CC Provider | Email | Job | IP Address | Language | Purchase Price
0 | 16629 Pace Camp Apt. 448\nAlexisborough, NE 77... | 46 in | PM | Opera/9.56.(X11; Linux x86_64; sl-SI) Presto/2... | Martinez-Herman | 6011929061123406 | 02/20 | 900 | JCB 16 digit | pdunlap@yahoo.com | Scientist, product/process development | 149.146.147.205 | el | 98.14
1 | 9374 Jasmine Spurs Suite 508\nSouth John, TN 8... | 28 rn | PM | Opera/8.93.(Windows 98; Win 9x 4.90; en-US) Pr... | Fletcher, Richards and Whitaker | 3337758169645356 | 11/18 | 561 | Mastercard | anthony41@reed.com | Drilling engineer | 15.160.41.51 | fr | 70.73
2 | Unit 0065 Box 5052\nDPO AP 27450 | 94 vE | PM | Mozilla/5.0 (compatible; MSIE 9.0; Windows NT ... | Simpson, Williams and Pham | 675957666125 | 08/19 | 699 | JCB 16 digit | amymiller@morales-harrison.com | Customer service manager | 132.207.160.22 | de | 0.95
3 | 7780 Julia Fords\nNew Stacy, WA 45798 | 36 vm | PM | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0 ... | Williams, Marshall and Buchanan | 6011578504430710 | 02/24 | 384 | Discover | brent16@olson-robinson.info | Drilling engineer | 30.250.74.19 | es | 78.04
4 | 23012 Munoz Drive Suite 337\nNew Cynthia, TX 5... | 20 IE | AM | Opera/9.58.(X11; Linux x86_64; it-IT) Presto/2... | Brown, Watson and Andrews | 6011456623207998 | 10/25 | 678 | Diners Club / Carte Blanche | christopherwright@gmail.com | Fine artist | 24.140.33.94 | es | 77.82
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   Address           10000 non-null  object 
 1   Lot               10000 non-null  object 
 2   AM or PM          10000 non-null  object 
 3   Browser Info      10000 non-null  object 
 4   Company           10000 non-null  object 
 5   Credit Card       10000 non-null  int64  
 6   CC Exp Date       10000 non-null  object 
 7   CC Security Code  10000 non-null  int64  
 8   CC Provider       10000 non-null  object 
 9   Email             10000 non-null  object 
 10  Job               10000 non-null  object 
 11  IP Address        10000 non-null  object 
 12  Language          10000 non-null  object 
 13  Purchase Price    10000 non-null  float64
dtypes: float64(1), int64(2), object(11)
memory usage: 1.1+ MB
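# Preprocessing: the card number is an identifier, not a quantity, so drop the int dtype
df['Credit Card'] = df['Credit Card'].astype(object)
# Preprocessing: same for the security code
df['CC Security Code'] = df['CC Security Code'].astype(object)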
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   Address           10000 non-null  object 
 1   Lot               10000 non-null  object 
 2   AM or PM          10000 non-null  object 
 3   Browser Info      10000 non-null  object 
 4   Company           10000 non-null  object 
 5   Credit Card       10000 non-null  object 
 6   CC Exp Date       10000 non-null  object 
 7   CC Security Code  10000 non-null  object 
 8   CC Provider       10000 non-null  object 
 9   Email             10000 non-null  object 
 10  Job               10000 non-null  object 
 11  IP Address        10000 non-null  object 
 12  Language          10000 non-null  object 
 13  Purchase Price    10000 non-null  float64
dtypes: float64(1), object(13)
memory usage: 1.1+ MB
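
The casts above run after the CSV has already been parsed as integers, so a code written with a leading zero would already have lost it by then. Here is a minimal sketch of reading those columns as strings at load time instead. The three-row `sample_csv` is made up for illustration (it is not the real file), and the leading-zero codes in it are hypothetical.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for ecommerse.csv.
sample_csv = io.StringIO(
    "Credit Card,CC Security Code,Purchase Price\n"
    "6011929061123406,900,98.14\n"
    "3337758169645356,0561,70.73\n"
    "675957666125,007,0.95\n"
)

# dtype maps column names to types; str keeps the digits exactly as written.
cards = pd.read_csv(
    sample_csv,
    dtype={"Credit Card": str, "CC Security Code": str},
)

print(cards.dtypes)
print(cards["CC Security Code"].tolist())  # leading zeros survive
```

With this approach no separate `astype` step is needed, and numeric columns such as `Purchase Price` are still parsed as floats.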

Find the average purchase price

df['Purchase Price'].mean()
50.34730200000025
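
The long tail of digits is just floating-point noise. A small sketch of rounding the result for display, using only the five prices visible in the `df.head()` output (so the number differs from the full-data mean above):

```python
import pandas as pd

# Purchase prices of the first five rows shown by df.head().
prices = pd.Series([98.14, 70.73, 0.95, 78.04, 77.82], name="Purchase Price")

# Round only when presenting the value; keep full precision for further math.
average = prices.mean()
print(round(average, 2))

# describe() reports count, mean, std, min, quartiles and max in one call.
print(prices.describe())
```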

How many people use English (en) as their website language?

# Normalize the case before comparing so the match is robust
len(df[df['Language'].str.lower()=='en'])
1098
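
The same count can be taken without building a filtered DataFrame first, because a boolean Series sums to its number of `True` values. A minimal sketch on made-up language codes (the `language` Series below is hypothetical sample data):

```python
import pandas as pd

# Hypothetical language codes; the real column has 10,000 rows.
language = pd.Series(["en", "EN", "fr", "de", "En", "es"], name="Language")

# True counts as 1 and False as 0, so sum() is the number of matches.
is_english = language.str.lower() == "en"
print(is_english.sum())

# value_counts() gives the count for every language in one call.
print(language.str.lower().value_counts())
```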

What jobs do the customers have?

df['Job'].unique()
array(['Scientist, product/process development', 'Drilling engineer',
       'Customer service manager', 'Fine artist', 'Fish farm manager',
       'Dancer', 'Event organiser', 'Financial manager',
       'Forensic scientist', 'Development worker, community',
       'Diagnostic radiographer', 'Surveyor, quantity',
       'Accountant, chartered public finance', 'Acupuncturist',
       'Retail manager', 'Therapist, art', 'Designer, jewellery',
       'Photographer', 'Designer, interior/spatial',
       'Public relations officer', 'Presenter, broadcasting',
       'Field seismologist', 'Musician',
       'Training and development officer', "Barrister's clerk",
       'Careers adviser', 'Scientist, research (life sciences)',
       'Recycling officer', 'Fisheries officer', 'Sales executive',
       'Civil Service fast streamer', 'Theatre stage manager',
       'Therapist, music', 'Lecturer, further education',
       'Animal technologist', 'Psychologist, occupational',
       'Music therapist', 'Minerals surveyor',
       'Tourist information centre manager', 'Tax inspector',
       'Buyer, industrial', 'Purchasing manager', 'Heritage manager',
       'Games developer', 'Facilities manager', 'Chemist, analytical',
       'Horticulturist, amenity', 'Mechanical engineer',
       'Clinical research associate', 'Technical author',
       'Radio broadcast assistant', 'Engineer, broadcasting (operations)',
       'Clothing/textile technologist', 'Holiday representative',
       'Housing manager/officer', 'Teacher, special educational needs',
       'Engineer, agricultural', 'Buyer, retail',
       'Doctor, general practice', 'Hotel manager',
       'Engineer, civil (consulting)', 'General practice doctor',
       'Jewellery designer', 'Therapist, speech and language',
       'Operational researcher', 'Graphic designer',
       'Editorial assistant', 'Financial adviser', 'Paramedic',
       'Secretary, company', 'Paediatric nurse',
       'Data processing manager', 'Print production planner',
       'Engineer, materials', 'Economist', 'Designer, furniture',
       'Human resources officer', 'Administrator',
       'Teacher, secondary school', 'Video editor',
       'Chartered certified accountant',
       'Teacher, English as a foreign language', 'Airline pilot',
       'Engineer, land', 'Legal secretary', 'Administrator, arts',
       'Dealer', 'Town planner', 'Travel agency manager',
       'Engineer, water', 'Lecturer, higher education',
       'Higher education careers adviser', 'Intelligence analyst',
       'Technical sales engineer', 'Dietitian',
       'International aid/development worker',
       'Learning disability nurse', 'Geneticist, molecular',
       'Optometrist', 'Building surveyor', 'Sports therapist',
       'Geophysical data processor', 'Multimedia programmer',
       'Research scientist (maths)', 'Social worker', 'Camera operator',
       'Interpreter', 'Furniture designer', 'Commercial horticulturist',
       'Surgeon', 'Psychotherapist, dance movement',
       'Information systems manager', 'Music tutor',
       'Merchandiser, retail', 'Insurance risk surveyor',
       'Armed forces operational officer', 'Homeopath',
       'Animal nutritionist', 'Tourism officer', 'Cabin crew',
       'Psychiatric nurse', 'Magazine features editor',
       'Chartered management accountant', 'Market researcher',
       'Mudlogger', 'Textile designer', 'Restaurant manager, fast food',
       'Medical laboratory scientific officer',
       'Chartered public finance accountant', 'Financial trader',
       'Scientist, marine', 'Environmental manager',
       'Community education officer', 'Solicitor',
       'Psychologist, clinical', 'Occupational therapist',
       'Retail merchandiser', 'Mental health nurse', 'Ophthalmologist',
       'Insurance underwriter', 'Engineer, aeronautical',
       'Chartered accountant', 'Environmental health practitioner',
       'Research scientist (life sciences)', 'Wellsite geologist',
       'Product designer', 'Armed forces technical officer',
       'Television/film/video producer', 'Water quality scientist',
       'Furniture conservator/restorer', 'Radio producer',
       'Programmer, multimedia', 'Scientist, biomedical',
       'Psychologist, sport and exercise', 'Merchant navy officer',
       'Designer, television/film set', 'Engineering geologist',
       'Health and safety inspector', 'Naval architect', 'Chiropodist',
       'Engineer, chemical', 'Scientist, research (medical)',
       'Personal assistant', 'Retail banker',
       'Advertising account planner', 'Art therapist',
       'Forest/woodland manager', 'Research scientist (medical)',
       'Scientific laboratory technician', 'Licensed conveyancer',
       'Engineer, building services', 'Energy engineer', 'Teacher, music',
       'Oncologist', 'Brewing technologist', 'Community arts worker',
       'Therapist, drama', 'Water engineer', 'Software engineer',
       'Geochemist', 'Designer, industrial/product',
       'Nature conservation officer',
       'Production designer, theatre/television/film',
       'Commercial art gallery manager', 'Exhibition designer',
       'Education officer, environmental',
       'Runner, broadcasting/film/video', 'Pharmacist, hospital',
       'Doctor, hospital', 'Teacher, adult education', 'Sub',
       'Podiatrist', 'Air broker', 'Engineer, civil (contracting)',
       'Lexicographer', 'Speech and language therapist',
       'Catering manager', 'Colour technologist', 'Surveyor, building',
       'Agricultural consultant', 'Horticultural therapist',
       'Pension scheme manager', 'Theatre manager', 'Producer, radio',
       'Scientist, forensic', 'Surveyor, insurance',
       'Hospital pharmacist', 'Surveyor, rural practice',
       'Industrial/product designer', 'Corporate treasurer',
       'Structural engineer', 'Research officer, trade union',
       'Pensions consultant', 'Secretary/administrator', 'Firefighter',
       'Database administrator', 'Health promotion specialist',
       'Operations geologist', 'Regulatory affairs officer',
       'Production assistant, radio', 'Press sub',
       'Education administrator', 'Biomedical scientist', 'Tour manager',
       'Midwife', 'Television floor manager', 'Mining engineer',
       'Interior and spatial designer', 'Engineer, energy',
       'Publishing rights manager', 'Manufacturing systems engineer',
       'Designer, exhibition/display',
       'Control and instrumentation engineer', 'Ergonomist',
       'Scientist, research (maths)', 'Psychologist, forensic',
       'Clinical scientist, histocompatibility and immunogenetics',
       'Accommodation manager', 'Administrator, sports',
       'Sports development officer', 'Educational psychologist',
       'Social research officer, government', 'Quarry manager',
       'Medical sales representative', 'Consulting civil engineer',
       'Trade union research officer', 'Financial planner',
       'Futures trader', 'Medical illustrator', 'Engineer, structural',
       'Transport planner', 'Special educational needs teacher',
       'Chiropractor', 'Therapist, sports',
       'Scientist, research (physical sciences)', 'Site engineer',
       'Psychologist, counselling', 'Bonds trader', 'Banker',
       'Proofreader', 'Social researcher', 'Analytical chemist',
       'Dramatherapist', 'Editor, commissioning', 'Sports administrator',
       'Sport and exercise psychologist', 'Garment/textile technologist',
       'Engineer, maintenance (IT)', 'Surveyor, building control',
       'Media planner', 'Statistician', 'Food technologist',
       'Chartered loss adjuster', 'Petroleum engineer',
       'Engineer, maintenance', 'Television camera operator',
       'Geoscientist', 'Printmaker', 'Artist', 'Hospital doctor',
       'Systems analyst', 'Radiographer, therapeutic',
       'Freight forwarder', 'Call centre manager', 'Theatre director',
       'Occupational psychologist', 'Communications engineer',
       'Land/geomatics surveyor', 'Advertising account executive',
       'Civil engineer, consulting', 'Medical secretary',
       'IT technical support officer', 'Advice worker',
       'Visual merchandiser', 'Nurse, mental health',
       'Further education lecturer', 'Investment banker, corporate',
       'Bookseller', 'Engineer, production', 'Health visitor',
       'Archivist', 'Recruitment consultant',
       'Conservator, museum/gallery', 'Translator', 'Youth worker',
       'Operational investment banker',
       'Radiation protection practitioner',
       'Research officer, political party', 'Make',
       'Horticulturist, commercial', 'Surveyor, hydrographic',
       'Engineer, communications', 'Surveyor, minerals', 'Hydrologist',
       'Race relations officer', 'Quantity surveyor',
       'Geologist, wellsite', 'Museum/gallery conservator',
       'Community pharmacist', 'Adult guidance worker',
       'Building control surveyor', 'Museum/gallery exhibitions officer',
       'Ambulance person', 'Probation officer', 'Orthoptist',
       'Charity officer', 'Risk analyst', 'Best boy',
       'Arts administrator',
       'Lighting technician, broadcasting/film/video',
       'Engineer, drilling', 'Engineer, control and instrumentation',
       'Medical physicist', 'Dispensing optician',
       'Management consultant', 'Commercial/residential surveyor',
       'English as a foreign language teacher', 'Engineer, site',
       'Automotive engineer', 'Air traffic controller', 'Lawyer',
       'Contracting civil engineer', 'Psychotherapist, child',
       'Maintenance engineer', 'Science writer',
       'Administrator, education', 'Administrator, Civil Service',
       'Estate manager/land agent', 'Company secretary',
       'Therapist, horticultural', 'Broadcast engineer',
       'Sound technician, broadcasting/film/video',
       'Private music teacher', 'Engineer, manufacturing',
       'Emergency planning/management officer', 'Risk manager',
       'IT consultant', 'Geographical information systems officer',
       'Equities trader', 'Immigration officer',
       'Therapeutic radiographer', 'Curator', 'Child psychotherapist',
       'Advertising art director', 'Forensic psychologist',
       'Hydrographic surveyor', 'Immunologist', 'Sports coach',
       'Haematologist', 'Waste management officer',
       'Conservator, furniture', 'Research officer, government',
       'Agricultural engineer', 'IT sales professional',
       'Armed forces logistics/support/administrative officer',
       'Field trials officer', 'Psychotherapist', 'Air cabin crew',
       'Psychiatrist', 'Illustrator', 'Marine scientist',
       'Multimedia specialist', 'Cytogeneticist',
       'Occupational hygienist', 'Charity fundraiser',
       'Planning and development surveyor', 'Counselling psychologist',
       'Physicist, medical', 'TEFL teacher',
       'Research scientist (physical sciences)', 'Fashion designer',
       'Engineer, petroleum', 'Tree surgeon',
       'Corporate investment banker', 'Public house manager',
       'Designer, textile', 'Public librarian',
       'Arts development officer', 'Actor', 'Hydrogeologist',
       'Production assistant, television', 'Embryologist, clinical',
       'Press photographer', 'Marketing executive',
       'Television production assistant', 'Early years teacher',
       'Landscape architect', 'Librarian, academic', 'Web designer',
       'Health service manager', 'Clinical psychologist',
       'Exercise physiologist', 'Electrical engineer',
       'Psychologist, prison and probation services',
       'Journalist, magazine', 'Dance movement psychotherapist',
       'Animator', 'Legal executive', 'Plant breeder/geneticist',
       'Manufacturing engineer', 'Microbiologist', 'Biochemist, clinical',
       'Broadcast presenter', 'Ranger/warden',
       'Programme researcher, broadcasting/film/video', 'Dentist',
       'Aid worker', 'Publishing copy', 'Environmental education officer',
       'Librarian, public', 'Phytotherapist', 'Nurse, adult',
       'Programmer, systems', 'Scientist, physiological',
       'Secondary school teacher', 'Media buyer', 'Trade mark attorney',
       'Editor, film/video', 'Careers information officer',
       'Retail buyer', 'Ship broker', 'Lobbyist', 'Location manager',
       'Surveyor, mining', 'Systems developer', 'Optician, dispensing',
       'Chemical engineer', 'Oceanographer', 'Patent attorney',
       'Passenger transport manager', 'Glass blower/designer',
       'Psychologist, educational', 'Education officer, museum',
       'Patent examiner', 'Logistics and distribution manager',
       'Trading standards officer', 'Seismic interpreter',
       'Local government officer', 'Clinical cytogeneticist',
       'Biomedical engineer', 'Diplomatic Services operational officer',
       'Land', 'Surveyor, planning and development', 'Farm manager',
       'Claims inspector/assessor', 'Warden/ranger',
       'Civil engineer, contracting', 'Clinical molecular geneticist',
       'Surveyor, land/geomatics', 'Therapist, nutritional',
       'Historic buildings inspector/conservation officer',
       'English as a second language teacher', 'Materials engineer',
       'Insurance broker', 'Ecologist', 'Production manager',
       'Veterinary surgeon', 'Pilot, airline', "Nurse, children's",
       'Astronomer', 'Architect', 'Conservation officer, nature',
       'Personnel officer', 'Network engineer', 'Investment analyst',
       'Writer', 'Designer, fashion/clothing', 'Theme park manager',
       'Engineer, electrical', 'Art gallery manager', 'Office manager',
       'Pharmacologist', 'Meteorologist', 'Estate agent',
       "Politician's assistant", 'Environmental consultant', 'Osteopath',
       'Amenity horticulturist', 'Physiological scientist',
       'Public relations account executive',
       'Senior tax professional/tax inspector', 'Arboriculturist',
       'Radiographer, diagnostic', 'Engineer, automotive',
       'Editor, magazine features', 'Sales professional, IT',
       'Commissioning editor', 'Ceramics designer',
       'Scientist, clinical (histocompatibility and immunogenetics)',
       'Geologist, engineering', 'Designer, multimedia',
       'Engineer, manufacturing systems', 'Clinical embryologist',
       'Medical technical officer', 'Teacher, primary school',
       'Set designer', 'Producer, television/film/video',
       'Education officer, community',
       'Conservation officer, historic buildings',
       'Insurance account manager', 'Financial controller',
       'Film/video editor', 'Museum/gallery curator',
       'Accounting technician', 'Stage manager', 'Physiotherapist',
       'Sales promotion account executive', 'Horticultural consultant',
       'Journalist, broadcasting', 'Health and safety adviser',
       'Building services engineer',
       'Product/process development scientist', 'Metallurgist',
       'Audiological scientist', 'Designer, blown glass/stained glass',
       'Tax adviser', 'Industrial buyer',
       'Government social research officer',
       'Outdoor activities/education manager', 'Solicitor, Scotland',
       'Conference centre manager', 'Health physicist',
       'Pharmacist, community', 'Applications developer',
       'Public affairs consultant', 'Prison officer',
       'Production engineer', 'Actuary', 'Museum education officer',
       'Fitness centre manager', 'Records manager',
       'Journalist, newspaper', 'Computer games developer',
       'Toxicologist', 'Counsellor', 'Therapist, occupational',
       'Accountant, chartered certified', 'Insurance claims handler',
       'Programmer, applications', 'Accountant, chartered management',
       'Broadcast journalist', 'Energy manager', 'Barrister',
       'Scientist, audiological', 'Designer, graphic', 'Cartographer',
       'Community development worker', 'Engineer, mining',
       'Special effects artist', 'Gaffer', 'Civil Service administrator',
       'Armed forces training and education officer', 'Quality manager',
       'Engineer, electronics', 'Development worker, international aid',
       'Restaurant manager', 'Architectural technologist',
       'Academic librarian', 'Fast food restaurant manager',
       'Rural practice surveyor', 'Designer, ceramics/pottery',
       'Police officer', 'Soil scientist', 'Technical brewer',
       'Administrator, local government', 'Copywriter, advertising',
       'Volunteer coordinator', 'Magazine journalist',
       'Information officer', 'Copy', 'Advertising copywriter',
       'Engineer, biomedical', 'Nurse, learning disability',
       'Aeronautical engineer', 'Archaeologist', 'IT trainer',
       'Clinical biochemist',
       'Administrator, charities/voluntary organisations', 'Pathologist',
       'Geophysicist/field seismologist',
       'Investment banker, operational', 'Teacher, early years/pre',
       'Electronics engineer', 'Exhibitions officer, museum/gallery',
       'Nutritional therapist', 'Teaching laboratory technician',
       'Herbalist', 'Leisure centre manager', 'Higher education lecturer',
       'Newspaper journalist', 'Accountant, chartered',
       'Loss adjuster, chartered', 'Surveyor, commercial/residential',
       'Warehouse manager', 'Scientist, water quality',
       'Primary school teacher',
       'Chartered legal executive (England and Wales)',
       'Telecommunications researcher', 'Equality and diversity officer',
       'Learning mentor', 'Financial risk analyst',
       'Engineer, technical sales', 'Adult nurse'], dtype=object)
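
The array is long, so when only the number of distinct titles matters, `nunique()` is easier to read. Note that titles such as 'Therapist, music' and 'Music therapist' both appear above and are treated as different strings. A minimal sketch, using a handful of titles copied from the output with repeats added for illustration:

```python
import pandas as pd

# A few titles from the output above; the repeats are added for illustration.
jobs = pd.Series(
    ["Dietitian", "Lawyer", "Dietitian",
     "Music therapist", "Therapist, music", "Lawyer"],
    name="Job",
)

# nunique() counts distinct values; it matches the length of unique().
print(jobs.nunique())
print(len(jobs.unique()))
```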

Which jobs spent the most money in total?

df_Job = df.groupby('Job')
df_Job['Purchase Price'].sum().sort_values(ascending=False)[:10]
Job
Dietitian                            1605.30
Lawyer                               1603.85
Purchasing manager                   1577.97
Therapist, art                       1526.31
Clinical cytogeneticist              1495.92
Research officer, political party    1488.79
Designer, jewellery                  1482.20
Interior and spatial designer        1466.20
Network engineer                     1421.73
Social researcher                    1416.34
Name: Purchase Price, dtype: float64
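
Sorting everything and slicing the first ten works fine; `nlargest` is a slightly shorter way to say the same thing. A minimal sketch on a made-up `purchases` table (the jobs and prices below are hypothetical):

```python
import pandas as pd

# Hypothetical purchases; the real data has 10,000 rows.
purchases = pd.DataFrame({
    "Job": ["Lawyer", "Dietitian", "Lawyer", "Actor", "Dietitian", "Lawyer"],
    "Purchase Price": [10.0, 40.0, 20.0, 5.0, 15.0, 30.0],
})

# Total spend per job, then keep only the two largest totals.
total_by_job = purchases.groupby("Job")["Purchase Price"].sum()
top2 = total_by_job.nlargest(2)
print(top2)
```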

Which jobs made the most purchases (by count)?

df['Job'].value_counts()[:10]
Interior and spatial designer        31
Lawyer                               30
Social researcher                    28
Purchasing manager                   27
Designer, jewellery                  27
Research officer, political party    27
Dietitian                            26
Social worker                        26
Charity fundraiser                   26
Special educational needs teacher    26
Name: Job, dtype: int64
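
Many of the same jobs show up in both the top-spending list and the top-count list, which makes sense: a total grows with the number of purchases. Putting total, count, and average side by side makes that easier to see. A minimal sketch using named aggregation on made-up data (the `purchases` table and the output column names are assumptions for illustration):

```python
import pandas as pd

# Hypothetical purchases; the real data has 10,000 rows.
purchases = pd.DataFrame({
    "Job": ["Lawyer", "Dietitian", "Lawyer", "Actor", "Dietitian", "Lawyer"],
    "Purchase Price": [10.0, 40.0, 20.0, 5.0, 15.0, 30.0],
})

# One row per job with total spend, number of purchases, and mean price.
summary = (
    purchases.groupby("Job")["Purchase Price"]
    .agg(total="sum", n_purchases="count", average="mean")
    .sort_values("total", ascending=False)
)
print(summary)
```

In this toy table the job with the highest total is not the one with the highest average, which is the kind of difference the two separate rankings cannot show.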

Find the purchase history of the user with the email address bondellen@williams-garza.com

# Lowercase with .str.lower() before comparing so the match is robust
df[df['Email'].str.lower() == 'bondellen@williams-garza.com']
     | Address | Lot | AM or PM | Browser Info | Company | Credit Card | CC Exp Date | CC Security Code | CC Provider | Email | Job | IP Address | Language | Purchase Price
1234 | 2470 Maria Manors Suite 185\nJoneshaven, MN 87251 | 82 UX | PM | Mozilla/5.0 (X11; Linux x86_64; rv:1.9.5.20) G... | Cole, King and Bowers | 4926535242672853 | 09/21 | 188 | American Express | bondellen@williams-garza.com | Planning and development surveyor | 159.182.3.50 | ru | 77.88
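
With 14 columns the result is hard to read, so it can help to select only the columns of interest with `.loc`. A minimal sketch on a two-row table: the values come from rows shown in this post, but the mixed case and trailing space in the first email are added here on purpose to show the normalization at work.

```python
import pandas as pd

# Two rows for illustration; the odd casing and trailing space are deliberate.
orders = pd.DataFrame({
    "Email": ["BondEllen@Williams-Garza.com ", "pdunlap@yahoo.com"],
    "Job": ["Planning and development surveyor",
            "Scientist, product/process development"],
    "Purchase Price": [77.88, 98.14],
})

target = "bondellen@williams-garza.com"

# Strip surrounding whitespace and lowercase before comparing.
mask = orders["Email"].str.strip().str.lower() == target

# .loc takes the row mask and a list of columns to keep.
match = orders.loc[mask, ["Email", "Job", "Purchase Price"]]
print(match)
```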

How many users paid with MasterCard and spent $50 or more?

# Without case normalization
len(df[(df['CC Provider'] == 'MasterCard') & (df['Purchase Price'] >= 50)])
0
# With case normalization (spaces removed and lowercased)
len(df[(df['CC Provider'].str.replace(' ', '').str.lower() == 'mastercard') & (df['Purchase Price'] >= 50)])
405
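
The first attempt returns 0 because the data spells the provider 'Mastercard' (see the `df.head()` output), not 'MasterCard'. Looking at the actual spellings with `value_counts()` before filtering catches this early. A minimal sketch using the five provider values from the first rows:

```python
import pandas as pd

# Provider values of the first five rows shown by df.head(), plus one repeat.
providers = pd.Series(
    ["JCB 16 digit", "Mastercard", "JCB 16 digit",
     "Discover", "Diners Club / Carte Blanche", "Mastercard"],
    name="CC Provider",
)

# Inspect how the values are really spelled before writing a filter.
print(providers.value_counts())

exact = (providers == "MasterCard").sum()                  # case-sensitive
normalized = (providers.str.lower() == "mastercard").sum() # case-insensitive
print(exact, normalized)
```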

How many users are on Windows (based on Browser Info)?

len(df[df['Browser Info'].str.lower().str.contains('windows')])
4994
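
`str.contains` also accepts `case=False`, which does the case-insensitive match in one step. A minimal sketch on four browser strings taken from the `df.head()` output, cut off before the point where that output elides them:

```python
import pandas as pd

# Browser strings from df.head(), shortened; not the full originals.
browser = pd.Series([
    "Opera/9.56.(X11; Linux x86_64; sl-SI)",
    "Opera/8.93.(Windows 98; Win 9x 4.90; en-US)",
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0",
], name="Browser Info")

# case=False ignores letter case, so no separate .str.lower() is needed.
uses_windows = browser.str.contains("windows", case=False)
print(uses_windows.sum())
```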

What share of all users are on Windows, Linux, and Mac?

len(df[df['Browser Info'].str.lower().str.contains('windows')]) / len(df['Browser Info']) * 100
49.94
len(df[df['Browser Info'].str.lower().str.contains('linux')]) / len(df['Browser Info']) * 100
23.22
len(df[df['Browser Info'].str.lower().str.contains('mac')]) / len(df['Browser Info']) * 100
26.840000000000003
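# A more CS-major way to write it: loop over the OS names instead of repeating the line

total = len(df)

# Named os_names so it does not shadow the standard library's os module
os_names = ['windows', 'linux', 'mac']
total_user = 0

for name in os_names:
    user = len(df[df['Browser Info'].str.lower().str.contains(name)])
    total_user += user
    rate = user / total * 100
    print(name, ':', rate)

print('total_user : %d' % total_user)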
windows : 49.94
linux : 23.22
mac : 26.840000000000003
total_user : 10000
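
The loop counts substring matches per OS name, so a browser string containing two of the names would be counted twice. Here the total comes out to exactly 10000, so that did not happen, but assigning each row a single label makes the shares add up to 100 by construction. A minimal sketch, where `os_family` is a hypothetical helper and the browser strings are shortened versions of the ones in the `df.head()` output:

```python
import pandas as pd

# Browser strings from df.head(), shortened; not the full originals.
browser = pd.Series([
    "Opera/9.56.(X11; Linux x86_64; sl-SI)",
    "Opera/8.93.(Windows 98; Win 9x 4.90; en-US)",
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0",
], name="Browser Info")


def os_family(user_agent: str) -> str:
    """Return the first matching OS name, or 'other' if none matches."""
    lowered = user_agent.lower()
    for name in ("windows", "linux", "mac"):
        if name in lowered:
            return name
    return "other"


# One label per row, then relative frequencies as percentages.
share = browser.map(os_family).value_counts(normalize=True) * 100
print(share)
```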

 

 
