Mysql
 sql >> Teknologi Basis Data >  >> RDS >> Mysql

Mengubah dua kerangka data di spark sql

Berikut adalah contoh kode yang dapat Anda gunakan, yang akan menghasilkan output ini:

Kodenya terlihat seperti ini :

val df1=sc.parallelize(Seq((0,"v1"),(0,"v2"),(1,"v3"),(1,"v1"))).toDF("id","values")
val df2=sc.parallelize(Seq((0,"a1","b1","-"),(1,"a2","-","b2"))).toDF("id","v1","v2","v3")
val joinedDF=df1.join(df2,"id")
val resultDF=joinedDF.rdd.map{row=>
val id=row.getAs[Int]("id")
val values=row.getAs[String]("values")
val feilds=row.getAs[String](values)
(id,values,feilds)
}.toDF("id","values","feilds")

Saat menguji di Konsol:

scala> val df1=sc.parallelize(Seq((0,"v1"),(0,"v2"),(1,"v3"),(1,"v1"))).toDF("id","values")
df1: org.apache.spark.sql.DataFrame = [id: int, values: string]

scala> df1.show
+---+------+
| id|values|
+---+------+
|  0|    v1|
|  0|    v2|
|  1|    v3|
|  1|    v1|
+---+------+


scala> val df2=sc.parallelize(Seq((0,"a1","b1","-"),(1,"a2","-","b2"))).toDF("id","v1","v2","v3")
df2: org.apache.spark.sql.DataFrame = [id: int, v1: string ... 2 more fields]

scala> df2.show
+---+---+---+---+
| id| v1| v2| v3|
+---+---+---+---+
|  0| a1| b1|  -|
|  1| a2|  -| b2|
+---+---+---+---+


scala> val joinedDF=df1.join(df2,"id")
joinedDF: org.apache.spark.sql.DataFrame = [id: int, values: string ... 3 more fields]

scala> joinedDF.show
+---+------+---+---+---+                                                        
| id|values| v1| v2| v3|
+---+------+---+---+---+
|  1|    v3| a2|  -| b2|
|  1|    v1| a2|  -| b2|
|  0|    v1| a1| b1|  -|
|  0|    v2| a1| b1|  -|
+---+------+---+---+---+


scala> val resultDF=joinedDF.rdd.map{row=>
     | val id=row.getAs[Int]("id")
     | val values=row.getAs[String]("values")
     | val feilds=row.getAs[String](values)
     | (id,values,feilds)
     | }.toDF("id","values","feilds")
resultDF: org.apache.spark.sql.DataFrame = [id: int, values: string ... 1 more field]

scala> 

scala> resultDF.show
+---+------+------+                                                             
| id|values|feilds|
+---+------+------+
|  1|    v3|    b2|
|  1|    v1|    a2|
|  0|    v1|    a1|
|  0|    v2|    b1|
+---+------+------+

Saya harap ini mungkin masalah Anda. Terima kasih!




  1. Database
  2.   
  3. Mysql
  4.   
  5. Oracle
  6.   
  7. Sqlserver
  8.   
  9. PostgreSQL
  10.   
  11. Access
  12.   
  13. SQLite
  14.   
  15. MariaDB
  1. PHP membuat tabel kesalahan 1064

  2. Spring 3 MVC + MySQL:tidak dapat menyimpan karakter €

  3. NodeJS:Di mana menghubungkan ke database dalam kode?

  4. cara mengimplementasikan perintah sql yang rumit

  5. Mengapa mesin MyISAM MySQL tidak mendukung kunci Asing?