cant add user to sudo group in centOS 7 i386(no GUI,Its minimal)

centos command-line sudo users privileges

1,548

You just need to add user Smit to group wheel which is have permission to run all commands with sudo command And you can accomplish it by entering the following command

vim /etc/group

look for wheel group and add smit to it

save and exit and thats it.

1,548

karnajitsen

Updated on September 18, 2022

Comments

karnajitsen over 1 year
I have to implement Khatri Rao product between 2 matrices in C. Mathematically this is a column major access of data and I can not change that. But if I use preload ( PLD instruction in ARMv7 ) to prefetch every next loop data will that solve the problem of performance in stead of using a row major access of data.

If yes how to preload properly?

Please check my preload code below,
```
void khatrirao_pref(double *C, double *A, double *B,
                  int nmax, int mmax, int pmax)
 {
  int i,k,l;
  for (i=0;i<nmax;i++)
    {
    for (k=0;k<mmax;k++)
      {
        asm("PLD [%0]\n\t" :: "r" (A+i+((nmax+1)*k)));       
      for (l=0;l<pmax;l++)
    {
            asm("PLD [%0]\n\t" :: "r" (B+i+((nmax+1)*l)));
           C[i+(nmax*((k*pmax)+l))]=A[i+(nmax*k)]*B[i+(nmax*l)];
    }}}
 }
```
- Jonathan Leffler about 8 years
  
  If you're always going to use column-major order, consider whether you can treat columns as rows and rows as columns by reversing the sense of array indices — where you had A[row][col], use A[col][row]. That gives you the caching benefit of accessing the data in memory sequence. It is not something to undertake lightly — measure and test very carefully.
- karnajitsen about 8 years
  
  @JonathanLeffler Hi Jonathan, Thank you for the reply. But I cannot do so. I have to strictly stay in collumn major access. I can not change the inner equation or the order of the 3 loops or the dimensions of arrays. I can only use prefetch to get the next loop data of A and B of the same column. I know this is kind of odd I am asking. What do you think ?
- BitBank about 8 years
  
  Going "against the grain" with your memory access pattern will definitely hurt performance. A better plan would be to prefetch the entire array into cache before it arrives at this code. As written, you're prefetching won't help; you need to give a hint to the CPU and then give it time to actually do the reading.
- karnajitsen about 8 years
  
  @BitBank: I tried to prefetch everything before, it was not helpful so much .
- artless noise about 8 years
  
  Try compiler switches. gcc knows about the PLD instruction and should use it. You can use compile specific options for a function. It is not likely to help manually if gcc doesn't think it will by itself; pld will function best with some unrolling to only issue a pld to fill the cache lines. Too many pld is harmful.
- karnajitsen about 8 years
  
  I receive 7% improvement from PLD. How much maximum speed up can be achieved? any guess.
- Todd Christensen about 8 years
  
  It really depends, but if it were me I would try for a little more than 7%. Since you're multiplying doubles, memory access may not be the most dominant part of the time... PLD can't help with time spent in the FP unit. 10-15% might be achievable, as a guess.
- Admin almost 7 years
  
  i had used the command in both manner.without <> and with <>
karnajitsen about 8 years

but I am using ARMv7 processor. Will __builtin_prefetch work there? I think PLD should be the instruction to do so.
Todd Christensen about 8 years

Yes, that's how you tell gcc/clang to emit that instruction.