No description

Dimitri Korsch 8aab0c584a added a README file 5 years ago
utils be7c8b4792 initial commit 5 years ago
.gitignore 8aab0c584a added a README file 5 years ago
README.md 8aab0c584a added a README file 5 years ago
create_annotations.py be7c8b4792 initial commit 5 years ago
postprocess_folders.py be7c8b4792 initial commit 5 years ago
postprocess_image_names.py be7c8b4792 initial commit 5 years ago
remove_duplicates.py be7c8b4792 initial commit 5 years ago
requirements.txt be7c8b4792 initial commit 5 years ago
run.py 8aab0c584a added a README file 5 years ago
test_reading.py be7c8b4792 initial commit 5 years ago

README.md

Google Image Crawler

Installation:

Requires Python >= 3.6 and pip.

pip install -r requirements.txt

Usage:

Given a file (queries.txt) with the following search queries:

001.Black_footed_Albatross
002.Laysan_Albatross
003.Sooty_Albatross
004.Groove_billed_Ani
# 005.Crested_Auklet
006.Least_Auklet
007.Parakeet_Auklet
# 008.Rhinoceros_Auklet
009.Brewer_Blackbird
010.Red_winged_Blackbird
011.Rusty_Blackbird
012.Yellow_headed_Blackbird
013.Bobolink
014.Indigo_Bunting
...

you can use the main script to download images for each line of the query file (lines beginning with # are skipped):

python run.py queries.txt -o downloads -l 20

See python run.py --help for the full list of arguments and their documentation.
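
The comment handling of the query file can be sketched roughly as follows. This is a minimal illustration, not the code used in run.py: the function name read_queries is hypothetical, and skipping blank lines is an assumption that the README does not state.

```python
from typing import Iterable, List


def read_queries(lines: Iterable[str]) -> List[str]:
    """Return the active queries from an iterable of text lines.

    Hypothetical helper: lines starting with '#' are treated as
    commented out; blank lines are also skipped (an assumption).
    """
    queries = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        queries.append(stripped)
    return queries
```

A file object can be passed directly, for example `with open("queries.txt") as f: queries = read_queries(f)`.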

Notes about query strings:

  • everything before the first . is removed (something.query, query and 003.query are handled equally)
  • all _ are replaced with a space (some_query and some query are handled equally)
  • all capitals are converted to lower case (SOME QUERY, Some Query and some query are handled equally)
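
The three rules above could be combined as in the following sketch. The function name normalize_query is hypothetical and the order in which the rules are applied is an assumption; the actual implementation in this repository may differ.

```python
def normalize_query(raw: str) -> str:
    """Apply the three query-string rules to one line of the query file.

    Hypothetical helper, not taken from run.py.
    """
    # Rule 1: drop everything before the first '.' (and the '.' itself);
    # a string without a '.' is kept unchanged.
    _, sep, rest = raw.partition(".")
    query = rest if sep else raw
    # Rule 2: underscores become spaces.
    query = query.replace("_", " ")
    # Rule 3: convert to lower case.
    return query.lower()
```

Under these assumptions, `normalize_query("001.Black_footed_Albatross")` yields `"black footed albatross"`, which is the string that would be sent as the search query.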