Why does my SQL Server use AWE memory? and why is this not visible in RAMMap?
Does the SQL server account have the "lock pages in memory" option?
In a nutshell: 1) allocating this way is slightly faster, and 2) it is required for SQL Server to prevent its memory from being paged out.
Is there a specific reason you don't want SQL to use AWE?
- On 64-bit systems (2005+):
- AWE is not required (and in fact enabling it does nothing)
- Turning on the "Locked Pages in Memory" privilege prevents the buffer pool memory (and anything that uses single-page-at-a-time memory allocations) from being paged out
- When the "Locked Pages in Memory" privilege is set, SQL Server uses the Windows AWE API to do memory allocations as it's a little bit faster
- "Locked Pages in Memory" is supported by Standard and Enterprise editions (see this blog post for how to enable it in Standard edition)
See also: Fun with Locked Pages, AWE, Task Manager, and the Working Set… - this explains why setting "use AWE" to false doesn't actually prevent use of AWE (the setting is only relevant on 32-bit).
Aris Kantas
Updated on September 18, 2022

Comments
-
Aris Kantas over 1 year
I want to convert a DataFrame which contains Double values into a List, so that I can use it for calculations. What do you suggest so that I get a List of the correct type (i.e. Double)?
My approach is this :
var newList = myDataFrame.collect().toList
but it returns a List[org.apache.spark.sql.Row], and I don't know exactly what that type is!
Is it possible to skip that step and simply pass my DataFrame into a function and do calculations with it? (For example, I want to compare the third element of its second column with a specific double. Can I do that directly from my DataFrame?)
In any case, I have to understand how to create the right type of List each time!
EDIT:
Input Dataframe:
+---+---+
|_c1|_c2|
+---+---+
|0  |0  |
|8  |2  |
|9  |1  |
|2  |9  |
|2  |4  |
|4  |6  |
|3  |5  |
|5  |3  |
|5  |9  |
|0  |1  |
|8  |9  |
|1  |0  |
|3  |4  |
|8  |7  |
|4  |9  |
|2  |5  |
|1  |9  |
|3  |6  |
+---+---+
Result after conversion:
List((0,0), (8,2), (9,1), (2,9), (2,4), (4,6), (3,5), (5,3), (5,9), (0,1), (8,9), (1,0), (3,4), (8,7), (4,9), (2,5), (1,9), (3,6))
But every element in the List has to be of type Double.
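The conversion being asked for can be sketched in plain Scala, with (String, String) tuples standing in for the collected Rows (a real collect() returns Array[org.apache.spark.sql.Row], which needs a Spark session to demonstrate; only the first three rows of the example data are used here):

```scala
// Rows as they might come back after collect(), with plain (String, String)
// tuples standing in for org.apache.spark.sql.Row.
val collected: List[(String, String)] =
  List(("0", "0"), ("8", "2"), ("9", "1"))

// Convert each element with toDouble to get the desired List[(Double, Double)].
val doubles: List[(Double, Double)] =
  collected.map { case (a, b) => (a.toDouble, b.toDouble) }
```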
-
koiralo about 5 years: Can you explain a bit more, with inputs and expected results?
-
Aris Kantas about 5 years: Hello! Check again!
-
Aris Kantas about 5 years: What type of list does this return?
-
Aris Kantas about 5 years: This approach takes one column of my DataFrame and passes it into a Sequence? What about taking all the columns of my DataFrame? And can I use the Sequence the same way I would use a List?
-
Muhunthan about 5 years: Yes, same way as a List. myDataFrame.select("column1", "column2").collect().map(each => (each.getAs[String]("column1"), each.getAs[String]("column2"))).toList will give you a list of tuples here
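The getAs pattern above can be adapted to yield doubles by parsing after the lookup. A minimal sketch, with a plain Map standing in for a Row (getAs[String] on a real Row returns the same raw string; the column names are the ones from the comment above):

```scala
// Stand-in for a Row whose columns were read as strings (e.g. from a CSV).
val row: Map[String, String] = Map("column1" -> "8", "column2" -> "2")

// Parse each string with toDouble to build a (Double, Double) tuple
// instead of a (String, String) one.
val pair: (Double, Double) =
  (row("column1").toDouble, row("column2").toDouble)
```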
-
Aris Kantas about 5 years: Thank you! Now I want to ask: what type of elements will this List contain, String or Double?
-
Muhunthan about 5 years: In this example I cast both columns as String; you must cast according to your DataFrame. In this example you will get a list of tuples (List[(String, String)])
-
Aris Kantas about 5 years: I have provided my DataFrame in the latest edit on my post. As you can see I have numbers, so my result has to be Double. How do I change your suggestion in order to get the result I want?
-
Muhunthan about 5 years: Please check my answer now
-
Aris Kantas about 5 years: Hello! I understand the logic behind your approach, but I am losing it on the practical side as I am very new to Scala. What I understand here is that you create a DataFrame with hard-coded data. In my situation the DataFrame I have to create comes from a dataset file (CSV), and it won't be the same every time. So how can I cast a column of an existing DataFrame into Double?
-
Aris Kantas about 5 years: Hello! It gives me this error:
myList.foreach(x => println(x.a.getClass, x.b.getClass, x.c.getClass))
error: too many arguments for method println: (x: Any)Unit
-
Aris Kantas about 5 years: Also, when I run the "convert to RDD" command on my DataFrame (without any cast change) I get this error:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 333.0 failed 1 times, most recent failure: Lost task 0.0 in stage 333.0 (TID 333, localhost, executor driver): java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Double
-
koiralo about 5 years: In those cases you need to clean your data: remove nulls, remove values that cannot be converted to Double, and so on. It depends on your implementation. You can create a udf in which you clean all those things.
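The cleaning step described here can be sketched as an ordinary Scala function, which a udf would then wrap (scala.util.Try is standard library; the function name safeToDouble is made up for illustration):

```scala
import scala.util.Try

// Returns Some(d) when the string parses as a Double, None for nulls
// and values that cannot be converted, so bad rows can be dropped.
def safeToDouble(s: String): Option[Double] =
  Option(s).flatMap(v => Try(v.trim.toDouble).toOption)

// Nulls and junk values are removed; parseable strings become doubles.
val cleaned: List[Double] = List("8", " 2 ", null, "abc").flatMap(safeToDouble)
```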
-
koiralo about 5 years: Yes, the error is clear: you need to cast the column to Double
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Double
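The distinction behind this ClassCastException can be shown in plain Scala: a value whose runtime class is String cannot be cast to Double; it can only be parsed into one:

```scala
val s: Any = "8"

// Casting fails at runtime: the object is a java.lang.String, not a Double.
val castResult: Either[String, Double] =
  try Right(s.asInstanceOf[Double])
  catch { case e: ClassCastException => Left(e.getClass.getSimpleName) }

// Parsing succeeds: toDouble reads the characters and builds a new Double.
val parsed: Double = "8".toDouble
```

This is why the column (or each value) has to go through a cast/parse step rather than being used directly after collect().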
-
Aris Kantas about 5 years: Oh ok! I understand! That means I have to do much more digging around! Thank you for your response!
-
koiralo about 5 years: @ArisKantas I have added a sample udf, please take a look. Hope this helps!
-
MIKHIL NAGARALE about 5 years: Which version of Spark are you using? Or you can check individual element data types using myList.foreach(x=>println(x.a.getClass)); likewise you can check data types for individual columns or concatenate them into a single string (it's just to check the data types).