
Parallel Programming and Heterogeneous Computing Feedback Assignment


  1. Parallel Programming and Heterogeneous Computing: Feedback Assignment 2. Max Plauth, Sven Köhler, Felix Eberhardt, Lukas Wenzel, and Andreas Polze. Operating Systems and Middleware Group

  2. Assignment 1: Covered Topics
     ■ General Concepts:
       □ Foster’s Method
       □ Amdahl’s Law
     ■ Shared Memory Parallelism with OpenMP:
       □ Task 2.1: Heat Map
       □ Task 2.2: IO-bound problem and reentrancy of legacy functions
       □ Task 2.3: Task-parallel workloads
       □ Task 2.4: Java Monitors
     ■ Hardware Effects:
       □ Efficient use of caching
     ParProg 2019, Feedback Assignment 2, Sven Köhler

  3. Parsum 1 ./heatmap

  4. Good Idea or Bad Idea?
         #ifdef WITH_OMP
         #pragma omp parallel for default(none) shared(heatmaps)
         #endif
         for (auto row = 1; row < height - 1; ++row) {
             for (auto col = 1; col < width - 1; ++col) {
                 /* ... */
             }
         }
     😐 No need to mask OpenMP pragmas (unless you call OpenMP runtime functions). Just leave -fopenmp out of CFLAGS.
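The remark on slide 4 can be illustrated with a small sketch. A compiler that is not given -fopenmp simply ignores the pragma, whereas calls into the OpenMP runtime need the header and the library, so only those are guarded. The guard uses the standard `_OPENMP` macro; the helpers `available_threads` and `smooth_rows` are hypothetical names invented for this example and are not part of any submission.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>  // only needed (and only guaranteed to exist) with -fopenmp
#endif

// Hypothetical helper: runtime functions are the one place that needs a guard.
int available_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;  // serial build
#endif
}

// Hypothetical stencil pass: each inner cell becomes the mean of itself and
// its left and right neighbours. Border cells are copied unchanged.
std::vector<double> smooth_rows(const std::vector<double>& in, int width, int height) {
    assert(in.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    std::vector<double> out(in);
    // No #ifdef here: without -fopenmp the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (int row = 1; row < height - 1; ++row) {
        for (int col = 1; col < width - 1; ++col) {
            const std::size_t i = static_cast<std::size_t>(row) * width + col;
            out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
        }
    }
    return out;
}
```

The same source file then builds as a serial or a parallel binary, depending only on whether -fopenmp is passed.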

  5. Heat Map: And the winner was (A1) …
     ./heatmap 1000 1000 1000 random.csv (4 runs)
     [Bar chart: runtime in seconds per submission. Submissions: submission16003, submission16005, submission16002, submission15983, submission16006, submission16022. Bar labels: 3300**, 30.54, 28.27, 12.18, 12.41, 0.00*]

  6. Heat Map: And the winner is …
     ./heatmap 1000 1000 1000 random.csv
     [Bar chart: runtime in seconds per submission. Submissions: submission16245, submission16417, submission16405, submission16429, submission16427. Bar labels: 12.259, 4.245, 2.432, 0.877, 0.484]

  7. decrypt 2 ./decrypt
     user266;Osten3
     user906;Bahnhof

  8. Your Verification Data
     Dictionary: the 42 most common terms from Unit A
     Passwords:
         barbera:Gozsjkgq.2N62      SubstitionItIs
         gene:SqJwiPjc8z9OQ         speedup0
         grace:L3xIP64G5RVk6        NotSoLowLevelNoMore
         ian:MyIR7zQEkP3Mg          partitioning
         sheelagh:7CYgbT6A0xsM6     FishEyes
         richard:oGlayhJ1bTXuE      ArrogantHippy
         margarete:qRbG.QWxv9c.6    GuideMeToTheMoon
         elon:UFy0LW2XSNPVo         FlyVeryHigh
         satoshi:Hqw9N3HL38lAw      BurnAllYourPower

  9. Good Idea or Bad Idea?
         #pragma omp parallel for shared(result1, result2)
         for (int i = 0; i < tasks.size(); i++) {
             /* ... */
             if (result1.found && result2.found) continue;
             for (int j = 0; j < dictPasswords.size(); j++) {
                 auto password = dictPasswords[j];
                 struct crypt_data data;
                 data.initialized = 0;
                 /* ... */
             }
         }
     ☹ Only two results, no synchronization on the result variables
     ☹ Wide jumps through the dictionary data (dict >> tasks); swap the loops for locality
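The loop swap suggested on slide 9 might look roughly like the sketch below: the parallel outer loop walks the large dictionary in contiguous chunks, and the inner loop visits the few user entries. Everything named here is an assumption made for illustration. `Task`, `toy_crypt` and `crack` are hypothetical, and `toy_crypt` is a deliberately trivial stand-in for `crypt_r` so that the example runs without libcrypt.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical user entry; the slide does not show the real task type.
struct Task {
    std::string salt;      // two-character salt, as in traditional crypt(3)
    std::string hash;      // stored hash to match against
    std::string password;  // filled in once a candidate matches
    bool found = false;
};

// Toy stand-in for crypt_r(): salt followed by the reversed candidate.
// NOT a real hash, only here to keep the sketch self-contained.
std::string toy_crypt(const std::string& candidate, const std::string& salt) {
    assert(salt.size() == 2);
    return salt + std::string(candidate.rbegin(), candidate.rend());
}

// Outer loop over the large dictionary, inner loop over the few tasks.
void crack(std::vector<Task>& tasks, const std::vector<std::string>& dict) {
    const long words = static_cast<long>(dict.size());
    #pragma omp parallel for schedule(static)
    for (long j = 0; j < words; ++j) {
        for (std::size_t i = 0; i < tasks.size(); ++i) {
            if (toy_crypt(dict[j], tasks[i].salt) == tasks[i].hash) {
                // Matches are rare, so protecting the write with a critical
                // section costs little compared to the hashing work.
                #pragma omp critical
                {
                    tasks[i].password = dict[j];
                    tasks[i].found = true;
                }
            }
        }
    }
}
```

With `schedule(static)` each thread reads one contiguous slice of the dictionary, while the small task list stays hot in cache for every thread.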

  10. Good Idea or Bad Idea?
         struct crypt_data data;
         /* ... */
         data.initialized = 0;
         {
             if (strcmp(crypt_r((password + "0").c_str(), salt, &data), hash) == 0) {
                 /* ... */
                 break;
             }
             if (strcmp(crypt_r((password + "1").c_str(), salt, &data), hash) == 0) {
                 /* ... */
                 break;
             }
             /* ... */
         }
     ☹ Loop unrolling only helps with tight loops
     ☹ Potential overhead for string buffer allocation + free

  11. Good Idea or Bad Idea?
         #pragma omp parallel shared(db,dict)
         {
             #pragma omp master
             {
                 uint64_t last = 0;
                 for (uint64_t i = 0; i < db_size; i++) { /* iterate through entire db character by character */
                     if (db[i] == '\n') { /* if we are at a newline */
                         db[i] = '\0'; /* 0-terminate user entry */
                         #pragma omp task
                         crack_user(db+last);
                         last = i+1;
                     }
                 }
             }
             #pragma omp taskwait
         }
     ☺ Start tasks while parsing input
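A self-contained variant of the pattern on slide 11 is sketched below, under a few assumptions. The real `crack_user` is not shown on the slide, so a hypothetical stand-in that only adds up entry lengths takes its place. The sketch also uses `single` rather than `master` and waits for the tasks inside the generating region, which is a choice of this example and not a correction of the submission.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical per-user work: adds the entry length to a shared total.
void crack_user(const char* entry, std::size_t& total) {
    assert(entry != nullptr);
    const std::size_t len = std::strlen(entry);
    #pragma omp atomic
    total += len;
}

// Splits db at newlines and spawns one task per entry while still parsing.
// As on the slide, a trailing entry without a newline is not processed.
std::size_t process_db(std::string db) {
    std::size_t total = 0;
    #pragma omp parallel shared(db, total)
    #pragma omp single
    {
        std::size_t last = 0;
        for (std::size_t i = 0; i < db.size(); ++i) {
            if (db[i] == '\n') {
                db[i] = '\0';  // 0-terminate the entry that just ended
                const char* entry = db.data() + last;
                #pragma omp task firstprivate(entry) shared(total)
                crack_user(entry, total);
                last = i + 1;
            }
        }
        #pragma omp taskwait  // the parsing thread waits for its child tasks
    }
    return total;
}
```

One thread parses and the other threads of the team pick up tasks as soon as they are created, so the work on early entries overlaps with parsing the rest of the input.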

  12. Good Idea or Bad Idea?
         while (dictFile >> word) {
             if (common_8_prefix(word, previousWord)) {
                 continue;
             }
             previousWord = word;
             words.emplace_back(word);
         }
     ☺ Smart reduction of problem size: crypt(3) only operates on the first 8 chars of the input
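Slide 12 only shows the call to `common_8_prefix`, so the body below is a guess at what such a helper could look like and not the submission's code. The wrapper `load_words` is likewise a hypothetical name. Note that the filter compares each word only with the last word that was kept, so it removes all duplicates only if the dictionary is sorted.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical implementation: true if both words agree on their first
// 8 characters (shorter words must be identical to count as equal).
bool common_8_prefix(const std::string& a, const std::string& b) {
    return a.compare(0, 8, b, 0, 8) == 0;
}

// Reads whitespace-separated words and drops every word whose first
// 8 characters match those of the previously kept word.
std::vector<std::string> load_words(std::istream& dictFile) {
    std::vector<std::string> words;
    std::string word;
    std::string previousWord;
    while (dictFile >> word) {
        if (common_8_prefix(word, previousWord)) {
            continue;  // would hash identically under 8-character truncation
        }
        previousWord = word;
        words.emplace_back(word);
    }
    return words;
}
```

For a sorted input such as `pass password1 password2 passwords`, only `pass` and `password1` survive, because the last two words share the prefix `password` with an entry that was already kept.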

  13. decrypt: And the winner is …
     ./decrypt taskCryptPw.txt taskCryptDict.txt
     [Bar chart: runtime in seconds per submission. Submissions: submission16401, submission16408, submission16413, submission16388, submission16428, submission16409, submission16404. Bar labels: 3210.346, 1570.16, 662.597, 305.585, 274.29, 62.151, 51.305]

  14. Hash Ordered Index 3 ./hoi

  15. How to MD5?
     • Provide own implementation
     • Use OpenSSL
     • Use Glibc

  16. Good Idea or Bad Idea?
         #pragma omp parallel shared(hashes) firstprivate(seed)
         {
             const uint_fast32_t thread = omp_get_thread_num();
             const uint_fast32_t threads = omp_get_num_threads();
             const uint_fast32_t from = thread*(blocks/threads);
             const uint_fast32_t to = (thread != threads-1) ? (thread+1)*(blocks/threads) : blocks;
             add(seed, from);
             for (unsigned int i = from; i < to; i++) {
                 __uint128_t v = md5(seed);
                 #pragma omp critical
                 hashes.push_back(v);
                 inc(seed);
             }
         }
     😐 Manual scheduling goes against the paradigm; use a scheduling clause, if really needed
     ☹ Use std::vector::reserve and index operations to get rid of the need for synchronization
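Both remarks on slide 16 can be combined in one sketch: the work is split with a `schedule` clause instead of by hand, and every iteration writes to its own slot so that no critical section is needed. Two assumptions apply. `toy_hash` is a made-up stand-in for the `md5(seed)` call, and the vector is sized up front (the sized constructor or `resize`), since `reserve` alone does not make indexing past `size()` valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Made-up stand-in for md5(seed + i); only here to keep the sketch runnable.
std::uint64_t toy_hash(std::uint64_t seed, std::uint64_t i) {
    std::uint64_t x = seed + i;
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    return x;
}

// Fills one slot per block; slot i depends only on seed and i.
std::vector<std::uint64_t> generate(std::uint64_t seed, std::int64_t blocks) {
    assert(blocks >= 0);
    // Sized up front, so the parallel loop never changes the vector's size.
    std::vector<std::uint64_t> hashes(static_cast<std::size_t>(blocks));
    #pragma omp parallel for schedule(static)
    for (std::int64_t i = 0; i < blocks; ++i) {
        // Each iteration owns exactly one element: no synchronization needed.
        hashes[static_cast<std::size_t>(i)] = toy_hash(seed, static_cast<std::uint64_t>(i));
    }
    return hashes;
}
```

Because each slot is derived from the loop index, the result is also in a deterministic order, which the `push_back` version under a critical section does not guarantee.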

  17. Good Idea or Bad Idea?
         std::sort(hashes.begin(),hashes.end(),less);
     ☹ Serial by default; better use task-parallelism with OpenMP. Since C++17: std::execution::parallel_policy

         int max_query = *std::max_element(queries.begin(), queries.end());
         if (max_query > 0.8f * n)
             std::sort(hashes.begin(), hashes.end());
         else
             std::partial_sort(hashes.begin(), hashes.begin() + max_query + 1, hashes.end());
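The second snippet on slide 17 sorts only as far as the largest queried index requires. A runnable version might look like the sketch below. The function name `sort_for_queries`, the 64-bit element type and the assumption that queries are zero-based indices smaller than the number of hashes are all choices made for this example. The 0.8 threshold is taken from the slide.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sorts hashes just far enough that every queried index holds its final value.
void sort_for_queries(std::vector<std::uint64_t>& hashes,
                      const std::vector<std::size_t>& queries) {
    if (hashes.empty() || queries.empty()) return;
    const std::size_t n = hashes.size();
    const std::size_t max_query = *std::max_element(queries.begin(), queries.end());
    assert(max_query < n);  // assumption: queries are valid zero-based indices
    if (max_query > 0.8 * static_cast<double>(n)) {
        // Nearly everything is needed anyway: a full sort is simpler.
        std::sort(hashes.begin(), hashes.end());
    } else {
        // Only the smallest max_query + 1 elements end up in sorted order.
        const std::ptrdiff_t count = static_cast<std::ptrdiff_t>(max_query) + 1;
        std::partial_sort(hashes.begin(), hashes.begin() + count, hashes.end());
    }
}
```

After the call, positions `0` through `max_query` contain the same values a full sort would put there. The order of the remaining elements is unspecified in the partial case.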

  18. Good Idea or Bad Idea?
         void qsort(unsigned char data[][MD5_DIGEST_LENGTH], unsigned int left, unsigned int right) {
             if (left < right) {
                 auto pivot = qpartition(data, left, right);
                 #pragma omp parallel sections
                 {
                     #pragma omp section
                     if (pivot > 0) {
                         qsort(data, left, pivot - 1);
                     }
                     #pragma omp section
                     if (pivot < right - 1) {
                         qsort(data, pivot + 1, right);
                     }
                 }
             }
         }
     😐 Good, but can be done faster with #pragma omp task (better task distribution, potentially higher data locality)
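The task-based alternative mentioned on slide 18 could be sketched as follows. This is not the submission's code: it sorts plain 64-bit values instead of MD5 digests, uses a three-way partition built from `std::partition`, and falls back to `std::sort` below a cutoff. The cutoff value and the names `quicksort_tasks` and `parallel_quicksort` are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed cutoff below which spawning tasks is not worth the overhead.
constexpr std::ptrdiff_t kCutoff = 1024;

// Sorts the half-open range [first, last) with one task per partition.
void quicksort_tasks(std::uint64_t* first, std::uint64_t* last) {
    assert(first <= last);
    if (last - first < kCutoff) {
        std::sort(first, last);
        return;
    }
    const std::uint64_t pivot = first[(last - first) / 2];  // copied before elements move
    // Three-way split: [first, mid1) < pivot, [mid1, mid2) == pivot, [mid2, last) > pivot.
    std::uint64_t* mid1 = std::partition(first, last,
        [pivot](std::uint64_t v) { return v < pivot; });
    std::uint64_t* mid2 = std::partition(mid1, last,
        [pivot](std::uint64_t v) { return v == pivot; });
    #pragma omp task firstprivate(first, mid1)
    quicksort_tasks(first, mid1);
    #pragma omp task firstprivate(mid2, last)
    quicksort_tasks(mid2, last);
    #pragma omp taskwait
}

// One parallel region for the whole sort; a single thread seeds the task tree.
void parallel_quicksort(std::vector<std::uint64_t>& v) {
    #pragma omp parallel
    #pragma omp single
    quicksort_tasks(v.data(), v.data() + v.size());
}
```

Unlike nested `parallel sections`, the tasks all run inside one thread team, so idle threads can pick up whichever partition is ready next.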

  19. Hint: Use Pipelining. Use pipelining to reduce allocated memory size and reduce possible paging.

  20. HOI: And the winner is …
     ./hoi deadc0deba5e 268435456 0 32768 268435453
     [Bar chart: runtime in seconds per submission. Submissions: submission16430, submission16421, submission16398, submission16419, submission16424, submission16410. Bar labels: 459.178, 381.494, 69.869, 62.887, 17.566, 18.620]

  21. ^D end
