Create Spark Dataset from MySQL table
Sample Program:
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Create a SparkSession that runs the job locally with two threads
SparkSession sparkSession = SparkSession.builder().appName("mysqltable").master("local[2]").getOrCreate();

// JDBC connection options for the MySQL table
Map<String, String> jdbcOptionsMap = new HashMap<>();
String jdbcUrl = "jdbc:mysql://" + hostname + ":" + portNo + "/" + databasename;
jdbcOptionsMap.put("url", jdbcUrl);
jdbcOptionsMap.put("driver", <driver>);
jdbcOptionsMap.put("dbtable", <tableName>);
jdbcOptionsMap.put("user", <username>);
jdbcOptionsMap.put("password", <password>);

System.out.println("Reading from MySQL");
// Read the table through the jdbc data source into a Dataset<Row>
Dataset<Row> mysqlData = sparkSession.read().format("jdbc").options(jdbcOptionsMap).load();
System.out.println("Column Names : " + Arrays.asList(mysqlData.columns()));
System.out.println("Table Schema : " + mysqlData.schema());
Bullet Points:
- In the above code, I first created a SparkSession to run the Spark job locally on two threads (master("local[2]")).
- In the second step, we provide the MySQL connection details as JDBC options: the URL, driver class, table name, username, and password.
- In the third step, we read the MySQL table through the jdbc data source and load it into the mysqlData Dataset using the load() method.
- We can get the table's column names using the columns() method.
- We can view the table schema using the schema() method (a short inspection sketch follows this list).
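As a quick sanity check after the load, a few standard Dataset methods can be used to inspect the data. The snippet below is a minimal sketch that assumes mysqlData was loaded as shown above; the row limit of 5 is an arbitrary choice.

// Print the schema in a tree format
mysqlData.printSchema();
// Preview the first five rows without pulling the whole table to the driver
mysqlData.show(5);
// count() runs a Spark job over the table, so use it carefully on large tables
System.out.println("Row count : " + mysqlData.count());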