Automatic parallelism through macro dataflow in high-level array languages (English)
In: 2014 23rd International Conference on Parallel Architecture and Compilation (PACT); 489-490; 2014
- Title: Automatic parallelism through macro dataflow in high-level array languages
- Contributors: Ratnalikar, Pushkar (author) / Chauhan, Arun (author)
- Publisher: IEEE
- Publication date: 2014-08-01
- Size: 303949 byte
- Type of media: Conference paper
- Type of material: Electronic Resource
- Language: English
Table of contents conference proceedings
The tables of contents are generated automatically and are based on the data records of the individual contributions available in the index of the TIB portal. The display of the Tables of Contents may therefore be incomplete.
- 1 | Keynote: Internet of mobile things: Challenges and opportunities | Nahrstedt, Klara et al. | 2014
- 3 | Virtues and limitations of commodity hardware transactional memory | Diegues, Nuno / Romano, Paolo / Rodrigues, Luis et al. | 2014
- 15 | Cooperative cache scrubbing | Sartor, Jennifer B. / Heirman, Wim / Blackburn, Stephen M. / Eeckhout, Lieven / McKinley, Kathryn S. et al. | 2014
- 27 | KLA: A new algorithmic paradigm for parallel graph computations | Harshvardhan / Fidel, Adam / Amato, Nancy M. / Rauchwerger, Lawrence et al. | 2014
- 39 | Tiling and optimizing time-iterated computations over periodic domains | Bondhugula, Uday / Bandishti, Vinayaka / Cohen, Albert / Potron, Guillain / Vasilache, Nicolas et al. | 2014
- 51 | ATCache: Reducing DRAM cache latency via a small SRAM tag cache | Huang, Cheng-Chieh / Nagarajan, Vijay et al. | 2014
- 61 | SpongeDirectory: Flexible sparse directories utilizing multi-level memristors | Zhang, Lunkai / Strukov, Dmitri / Saadeldeen, Hebatallah / Fan, Dongrui / Zhang, Mingzhe / Franklin, Diana et al. | 2014
- 75 | EFetch: Optimizing instruction fetch for event-driven web applications | Chadha, Gaurav / Mahlke, Scott / Narayanasamy, Satish et al. | 2014
- 87 | XStream: Cross-core spatial streaming based MLC prefetchers for parallel applications in CMPs | Panda, Biswabandan / Balachandran, Shankar et al. | 2014
- 99 | What is the cost of weak determinism? | Segulja, Cedomir / Abdelrahman, Tarek S. et al. | 2014
- 113 | ILP and TLP in shared memory applications: A limit study | Fatehi, Ehsan / Gratz, Paul V. et al. | 2014
- 127 | Versatile and scalable parallel histogram construction | Jung, Wookeun / Park, Jongsoo / Lee, Jaejin et al. | 2014
- 139 | Bitwise data parallelism in regular expression matching | Cameron, Robert D. / Shermer, Thomas C. / Shriraman, Arrvindh / Herdy, Kenneth S. / Lin, Dan / Hull, Benjamin R. / Lin, Meng et al. | 2014
- 151 | Adaptive heterogeneous scheduling for integrated GPUs | Kaleem, Rashid / Barik, Rajkishore / Shpeisman, Tatiana / Hu, Chunling / Lewis, Brian T. / Pingali, Keshav et al. | 2014
- 163 | Warp-aware trace scheduling for GPUs | Jablin, James A. / Jablin, Thomas B. / Mutlu, Onur / Herlihy, Maurice et al. | 2014
- 175 | CAWS: Criticality-aware warp scheduling for GPGPU workloads | Lee, Shin-Ying / Wu, Carole-Jean et al. | 2014
- 187 | Invyswell: A hybrid transactional memory for Haswell's restricted transactional memory | Calciu, Irina / Gottschlich, Justin / Shpeisman, Tatiana / Herlihy, Maurice / Pokam, Gilles et al. | 2014
- 201 | Consolidated conflict detection for hardware transactional memory | Zhao, Lihang / Draper, Jeffrey et al. | 2014
- 213 | DeSTM: Harnessing determinism in STMs for application development | Pande, Santosh / Gavrilovska, Ada / Ravichandran, Kaushik et al. | 2014
- 225 | PATS: Pattern aware scheduling and power gating for GPGPUs | Xu, Qiumin / Annavaram, Murali et al. | 2014
- 237 | Heterogeneous microarchitectures trump voltage scaling for low-power cores | Lukefahr, Andrew / Padmanabha, Shruti / Das, Reetuparna / Dreslinski, Ronald / Wenisch, Thomas F. / Mahlke, Scott et al. | 2014
- 251 | RCS: Runtime resource and core scaling for power-constrained multi-core processors | Ghasemi, Hamid Reza / Kim, Nam Sung et al. | 2014
- 263 | Realm: An event-based low-level runtime for distributed memory architectures | Aiken, Alex / Bauer, Michael / Treichler, Sean et al. | 2014
- 277 | kMAF: Automatic kernel-level management of thread and data affinity | Diener, Matthias / Cruz, Eduardo H. M. / Navaux, Philippe O. A. / Busse, Anselm / Heis, Hans-Ulrich et al. | 2014
- 289 | Shuffling: A framework for lock contention aware thread scheduling for multicore multiprocessor systems | Pusukuri, Kishore Kumar / Gupta, Rajiv / Bhuyan, Laxmi N. et al. | 2014
- 301 | Keynote: Domain-specific models for innovation in analytics | Blainey, Bob et al. | 2014
- 303 | OpenTuner: An extensible framework for program autotuning | Ansel, Jason / Kamil, Shoaib / Veeramachaneni, Kalyan / Ragan-Kelley, Jonathan / Bosboom, Jeffrey / O'Reilly, Una-May / Amarasinghe, Saman et al. | 2014
- 317 | Velociraptor: An embedded compiler toolkit for numerical programs targeting CPUs and GPUs | Garg, Rahul / Hendren, Laurie et al. | 2014
- 331 | Memory scheduling towards high-throughput cooperative heterogeneous computing | Wang, Hao / Singh, Ripudaman / Schulte, Michael J. / Kim, Nam Sung et al. | 2014
- 343 | Bounded memory scheduling of dynamic task graphs | Sbirlea, Dragos / Budimlic, Zoran / Sarkar, Vivek et al. | 2014
- 357 | Trading cache hit rate for memory performance | Ding, Wei / Kandemir, Mahmut / Guttman, Diana / Jog, Adwait / Das, Chita R. / Yedlapalli, Praveen et al. | 2014
- 369 | Compiler support for selective page migration in NUMA architectures | Piccoli, Guilherme / Santos, Henrique N. / Rodrigues, Raphael E. / Pousa, Christiane / Borin, Edson / Magno, Fernando et al. | 2014
- 381 | COLORIS: A dynamic cache partitioning system using page coloring | Ye, Ying / West, Richard / Cheng, Zhuoqun / Li, Ye et al. | 2014
- 393 | PEMOGEN: Automatic adaptive performance modeling during program runtime | Bhattacharyya, Arnamoy / Hoefler, Torsten et al. | 2014
- 405 | ArrayTool: A lightweight profiler to guide array regrouping | Liu, Xu / Sharma, Kamal / Mellor-Crummey, John et al. | 2014
- 417 | Design for scalability in enterprise SSDs | Tavakkol, Arash / Arjomand, Mohammad / Sarbazi-Azad, Hamid et al. | 2014
- 431 | D2MA: Accelerating coarse-grained data transfer for GPUs | Jamshidi, D. Anoushe / Samadi, Mehrzad / Mahlke, Scott et al. | 2014
- 443 | VAST: The illusion of a large memory space for GPUs | Lee, Janghaeng / Samadi, Mehrzad / Mahlke, Scott et al. | 2014
- 455 | Automatic optimization of thread-coarsening for graphics processors | Magni, Alberto / Dubach, Christophe / O'Boyle, Michael et al. | 2014
- 467 | Automatic execution of single-GPU computations across multiple GPUs | Cabezas, Javier / Vilanova, Lluis / Gelado, Isaac / Jablin, Thomas B. / Navarro, Nacho / Hwu, Wen-mei et al. | 2014
- 469 | LCA: A memory link and cache-aware co-scheduling approach for CMPs | Haritatos, Alexandros-Herodotos / Goumas, Georgios / Anastopoulos, Nikos / Nikas, Konstantinos / Kourtis, Kornilios / Koziris, Nectarios et al. | 2014
- 471 | A run-time power manager exploiting software parallelism | Holmbacka, Simon / Lafond, Sebastien / Lilius, Johan et al. | 2014
- 473 | Graph-based performance accounting for chip multiprocessor memory systems | Jahre, Magnus et al. | 2014
- 475 | SQRL: Hardware accelerator for collecting software data structures | Kumar, Snehasish / Shriraman, Arrvindh / Srinivasan, Vijayalakshmi / Lin, Dan / Phillips, Jordon et al. | 2014
- 477 | Optimizing stencil code via locality of computation | Luo, Yulong / Tan, Guangming et al. | 2014
- 479 | ADHA: Automatic data layout framework for heterogeneous architectures | Majeti, Deepak / Meel, Kuldeep S. / Barik, Rajkishore / Sarkar, Vivek et al. | 2014
- 481 | Active learning accelerated automatic heuristic construction for parallel program mapping | Ogilvie, William F. / Petoumenos, Pavlos / Wang, Zheng / Leather, Hugh et al. | 2014
- 483 | Preemptive thread block scheduling with online structural runtime prediction for concurrent GPGPU kernels | Pai, Sreepathi / Govindarajan, R. / Thazhuthaveetil, Matthew J. et al. | 2014
- 485 | Using STT-RAM to enable energy-efficient near-threshold chip multiprocessors | Pan, Xiang / Teodorescu, Radu et al. | 2014
- 487 | Protection and utilization in shared cache through rationing | Parihar, Raj / Brock, Jacob / Ding, Chen / Huang, Michael C. et al. | 2014
- 489 | Automatic parallelism through macro dataflow in high-level array languages | Ratnalikar, Pushkar / Chauhan, Arun et al. | 2014
- 491 | A runtime support mechanism for fast mode switching of a self-morphing core for power efficiency | Srinivasan, Sudarshan / Kurella, Nithesh / Koren, Israel / Kundu, Sandip / Rodrigues, Rance et al. | 2014
- 493 | Rollback-free value prediction with approximate loads | Thwaites, Bradley / Pekhimenko, Gennady / Esmaeilzadeh, Hadi / Yazdanbakhsh, Amir / Park, Jongse / Mururu, Girish / Mutlu, Onur / Mowry, Todd et al. | 2014
- 495 | Measuring flexibility in single-ISA heterogeneous processors | Tomusk, Erik / Dubach, Christophe / O'Boyle, Michael et al. | 2014
- 497 | SM-centric transformation: Circumventing hardware restrictions for flexible GPU scheduling | Wu, Bo / Chen, Guoyang / Li, Dong / Shen, Xipeng / Vetter, Jeffrey S. et al. | 2014
- 499 | An event-based language for dynamic binary translation frameworks | Makarov, Serguei / Brown, Angela Demke / Goel, Ashvin et al. | 2014
- 501 | Improving performance of streaming applications with filtering and control messages | Li, Peng / Buhler, Jeremy et al. | 2014
- 503 | Stratified sampling for even workload partitioning | Paudel, Jeeva / Amaral, Jose Nelson et al. | 2014
- 505 | Design of a hybrid MPI-CUDA benchmark suite for CPU-GPU clusters | Agarwal, Tejaswi / Becchi, Michela et al. | 2014
- 507 | Data remapping for an energy efficient burst chop in DRAM memory systems | Jagathrakshakan, Sudharsan / Tavva, Venkata Kalyan / Mutyam, Madhu et al. | 2014
- 509 | Data-reuse optimizations for pipelined tiling with parametric tile sizes | Isoard, Alexandre et al. | 2014
- 511 | From petascale to the pocket: Adaptively scaling parallel programs for mobile SoCs | Fidel, Adam / Amato, Nancy M. / Rauchwerger, Lawrence et al. | 2014
- 513 | Coarrays in GNU Fortran | Fanfarillo, Alessandro / Burnus, Tobias / Cardellini, Valeria / Filippone, Salvatore / Nagle, Dan / Rouson, Damian et al. | 2014
- 515 | Locality-aware memory association for multi-target worksharing in OpenMP | Scogland, Thomas R. W. / Feng, Wu-Chun et al. | 2014
- 517 | Processing big data graphs on memory-restricted systems | Harshvardhan / Amato, Nancy M. / Rauchwerger, Lawrence et al. | 2014
- 519 | Author index | 2014
- i | Front matters | 2014