Also, suppose I have two structs which look like this:
Code:
typedef struct {
    int nx;            /* no. of cells in x-direction */
    int ny;            /* no. of cells in y-direction */
    int max_iters;     /* no. of iterations */
    int reynolds_dim;  /* dimension for Reynolds number */
    float density;     /* density per link */
    float accel;       /* density redistribution */
    float omega;       /* relaxation parameter */
} param_t;
Code:
typedef struct {
    float speeds[9];
} speed_t;
Code:
/* Allocate arrays */
*cells_ptr = (speed_t*) malloc(sizeof(speed_t)*(params->ny*params->nx));
if (*cells_ptr == NULL) DIE("Cannot allocate memory for cells");
Basically, in each iteration of a loop I do something like
cells[iterator].speeds[k] = somecalculation();   /* for k = 0 to 8 */
and I want to parallelise it such that thread 1 does
cells[iterator].speeds[k] = somecalculation();   /* for k = 0 to 8 */
and thread 2 does
cells[iterator+1].speeds[k] = somecalculation(); /* for k = 0 to 8 */
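To make the intended split concrete, here is a rough sketch of it using OpenMP. OpenMP is an assumption on my part (any threading API would do), and update_cells plus the body of somecalculation are placeholders made up for illustration, not the real code. With schedule(static, 1), consecutive iterations are handed to different threads, which is the "thread 1 gets cell i, thread 2 gets cell i+1" pattern described above.

```c
#include <stddef.h>

typedef struct {
    float speeds[9];
} speed_t;

/* Hypothetical stand-in for the real per-speed calculation. */
static float somecalculation(int cell, int k)
{
    return (float)(cell * 9 + k);
}

/* Interleaved split: with schedule(static, 1) OpenMP deals iterations
 * out round-robin, so neighbouring cells go to different threads.
 * Without -fopenmp the pragma is ignored and the loop runs serially. */
static void update_cells(speed_t *cells, int ncells)
{
    #pragma omp parallel for schedule(static, 1)
    for (int i = 0; i < ncells; i++) {
        for (int k = 0; k < 9; k++)
            cells[i].speeds[k] = somecalculation(i, k);
    }
}
```

Calling update_cells(cells, params->nx * params->ny) would then fill every cell, with neighbouring cells written by different threads.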
I thought false sharing wouldn't happen because I expected cells[0] and cells[1] to be on different cache lines, but I am not entirely sure that this is the case. Can someone please tell me how the speed_t struct is stored in memory? Is it essentially the same as a 2D array, float cells[x][9]?
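One way to check this directly would be to look at the element size and at which cache line each element falls on. Below is a minimal sketch, assuming a 64-byte cache line and a 4-byte float (both common, neither guaranteed). CACHE_LINE, line_of, neighbours_share_line and print_layout are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes; check your CPU */

typedef struct {
    float speeds[9];
} speed_t;

/* Index of the (assumed 64-byte) cache line containing address p. */
static uintptr_t line_of(const void *p)
{
    return (uintptr_t)p / CACHE_LINE;
}

/* Returns 1 if the last byte of cells[i] and the first byte of
 * cells[i+1] lie on the same cache line, 0 otherwise. */
static int neighbours_share_line(const speed_t *cells, size_t i)
{
    assert(cells != NULL);
    const char *last_byte = (const char *)&cells[i] + sizeof(speed_t) - 1;
    return line_of(last_byte) == line_of(&cells[i + 1]);
}

/* Print the element size and where the first n elements start. */
static void print_layout(const speed_t *cells, size_t n)
{
    assert(cells != NULL);
    printf("sizeof(speed_t) = %zu\n", sizeof(speed_t));
    for (size_t i = 0; i < n; i++)
        printf("cells[%zu] starts at %p (line %lu)\n", i,
               (const void *)&cells[i], (unsigned long)line_of(&cells[i]));
}
```

Calling print_layout(*cells_ptr, 4) after the malloc above would show the stride. On typical platforms sizeof(speed_t) is 36, so array elements sit back to back 36 bytes apart, laid out the same way as float cells[n][9]. With 64-byte lines most neighbouring pairs would then share a line, and neighbours_share_line would return 1 for them.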