digraph G {
0 [labelType="html" label="<b>Execute InsertIntoHadoopFsRelationCommand</b><br><br>number of written files: 1<br>written output: 202.4 KiB<br>number of output rows: 3,189<br>number of dynamic part: 0"];
1 [labelType="html" label="<b>Exchange</b><br><br>shuffle records written: 3,189<br>shuffle write time total (min, med, max (stageId: taskId))<br>45 ms (0 ms, 0 ms, 0 ms (stage 63.0: task 925))<br>records read: 3,189<br>local bytes read: 119.8 KiB<br>fetch wait time: 0 ms<br>remote bytes read: 103.9 KiB<br>local blocks read: 106<br>remote blocks read: 94<br>data size total (min, med, max (stageId: taskId))<br>249.1 KiB (320.0 B, 1200.0 B, 1920.0 B (stage 63.0: task 842))<br>shuffle bytes written total (min, med, max (stageId: taskId))<br>223.6 KiB (329.0 B, 1085.0 B, 1691.0 B (stage 63.0: task 853))"];
subgraph cluster2 {
isCluster="true";
label="WholeStageCodegen (2)\n \nduration: total (min, med, max (stageId: taskId))\n9.6 s (5 ms, 15 ms, 436 ms (stage 63.0: task 1008))";
3 [labelType="html" label="<br><b>Project</b><br><br>"];
4 [labelType="html" label="<b>Filter</b><br><br>number of output rows: 3,189"];
5 [labelType="html" label="<b>HashAggregate</b><br><br>time in aggregation build total (min, med, max (stageId: taskId))<br>9.1 s (3 ms, 10 ms, 433 ms (stage 63.0: task 1008))<br>peak memory total (min, med, max (stageId: taskId))<br>3.2 GiB (16.5 MiB, 16.5 MiB, 16.5 MiB (stage 63.0: task 837))<br>number of output rows: 2,257,384<br>avg hash probe bucket list iters (min, med, max (stageId: taskId)):<br>(1.4, 1.4, 1.4 (stage 63.0: task 837))"];
}
6 [labelType="html" label="<b>Exchange</b><br><br>shuffle records written: 2,259,065<br>shuffle write time total (min, med, max (stageId: taskId))<br>159 ms (0 ms, 0 ms, 90 ms (stage 62.0: task 833))<br>records read: 2,259,065<br>local bytes read total (min, med, max (stageId: taskId))<br>69.3 MiB (334.9 KiB, 355.3 KiB, 377.6 KiB (stage 63.0: task 1012))<br>fetch wait time total (min, med, max (stageId: taskId))<br>7.8 s (0 ms, 2 ms, 429 ms (stage 63.0: task 1008))<br>remote bytes read total (min, med, max (stageId: taskId))<br>69.2 MiB (338.5 KiB, 354.5 KiB, 370.4 KiB (stage 63.0: task 877))<br>local blocks read: 200<br>remote blocks read: 200<br>data size total (min, med, max (stageId: taskId))<br>189.6 MiB (0.0 B, 0.0 B, 96.6 MiB (stage 62.0: task 833))<br>shuffle bytes written total (min, med, max (stageId: taskId))<br>138.5 MiB (0.0 B, 0.0 B, 70.6 MiB (stage 62.0: task 833))"];
subgraph cluster7 {
isCluster="true";
label="WholeStageCodegen (1)\n \nduration: total (min, med, max (stageId: taskId))\n2.3 s (15 ms, 56 ms, 1.2 s (stage 62.0: task 833))";
8 [labelType="html" label="<b>HashAggregate</b><br><br>time in aggregation build total (min, med, max (stageId: taskId))<br>1.3 s (13 ms, 49 ms, 679 ms (stage 62.0: task 833))<br>peak memory total (min, med, max (stageId: taskId))<br>336.8 MiB (256.0 KiB, 256.0 KiB, 192.0 MiB (stage 62.0: task 833))<br>number of output rows: 2,259,065<br>avg hash probe bucket list iters (min, med, max (stageId: taskId)):<br>(1.6, 1.7, 1.7 (stage 62.0: task 835))"];
9 [labelType="html" label="<b>ColumnarToRow</b><br><br>number of output rows: 2,260,701<br>number of input batches: 553"];
}
10 [labelType="html" label="<b>Scan parquet itv024694_lending_club.loans_defaulters_detail_rec_enq</b><br><br>number of files read: 2<br>scan time total (min, med, max (stageId: taskId))<br>221 ms (13 ms, 49 ms, 78 ms (stage 62.0: task 835))<br>metadata time: 0 ms<br>size of files read: 144.3 MiB<br>number of output rows: 2,260,701"];
1->0;
3->1;
4->3;
5->4;
6->5;
8->6;
9->8;
10->9;
}
11
Execute InsertIntoHadoopFsRelationCommand hdfs://m01.itversity.com:9000/user/itv024694/bad_data/bad_data_loan_details, false, CSV, [header=true, path=/user/itv024694/bad_data/bad_data_loan_details], Overwrite, [member_id]
Exchange RoundRobinPartitioning(1), REPARTITION_WITH_NUM, [id=#586]
Project [member_id#625]
Filter (total_count#623L > 1)
HashAggregate(keys=[member_id#625], functions=[count(1)])
WholeStageCodegen (2)
Exchange hashpartitioning(member_id#625, 200), ENSURE_REQUIREMENTS, [id=#580]
HashAggregate(keys=[member_id#625], functions=[partial_count(1)])
ColumnarToRow
WholeStageCodegen (1)
FileScan parquet itv024694_lending_club.loans_defaulters_detail_rec_enq[member_id#625] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[hdfs://m01.itversity.com:9000/public/trendytech/lendingclubproject/cleaned/loan..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<member_id:string>