This week, I started working on fast scrolling. It is closely related to virtual scrolling, so it was time I got into that as well.

Looking at other libraries that use virtual scrolling, one thing they seem to have in common is that they keep the data loaded in the front end beforehand. This lets them load rows rapidly as the user scrolls in a particular direction. That approach could not be used in OpenRefine for two reasons. First, loading the entire dataset would freeze the screen for a long time, since rendering takes a lot of time. Second, the rendering wasn’t designed to load a single row at a time and add it to the table. The latter could be changed, but I suspect it would affect performance. I am thinking of trying it out once the prototype is ready, to see whether it makes scrolling smoother.

Right now, I have the pageSize set to 100, so if the user scrolls very fast, we can use the scroll position to fire off the rendering functions and load the rows corresponding to that position. One thing I noticed: if you don’t wait for the user to stop scrolling before firing the rendering functions, the rendering often starts at a scroll position way above where the user intends to go. The problem is that JavaScript is single-threaded, so once the rendering functions start executing, there is no way to keep track of whether the user is still scrolling. This meant that rows way above the final scroll position would be loaded (since rendering the rows takes up a lot of time). A way to solve this is to execute these functions only when the user stops scrolling or, more precisely, when the scroll events themselves stop.
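To make the idea concrete, here is a minimal sketch of waiting for the scroll events to stop. The helper name `createScrollEndDetector` and its injectable `timers` argument are hypothetical and not part of OpenRefine; they only exist so the logic can be tried outside a browser.

```javascript
// Hypothetical helper (not OpenRefine code): calls `onStop` once no scroll
// event has arrived for `delayMs` milliseconds.
function createScrollEndDetector(onStop, delayMs, timers) {
  // Default to the real timers; wrapped so they are never called with a wrong `this`.
  var t = timers || {
    setTimeout: function (fn, ms) { return setTimeout(fn, ms); },
    clearTimeout: function (id) { clearTimeout(id); }
  };
  var pending = null;
  var lastScrollTop = 0;

  // Call the returned function from every scroll event.
  return function onScroll(scrollTop) {
    lastScrollTop = scrollTop;
    if (pending !== null) {
      t.clearTimeout(pending); // still scrolling: restart the wait
    }
    pending = t.setTimeout(function () {
      pending = null;
      onStop(lastScrollTop); // only the final position reaches the renderer
    }, delayMs);
  };
}
```

With this shape, a burst of scroll events leads to a single call with the last position, so the expensive rendering never starts on an intermediate position.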

I changed the scroll function to keep track of three elements: the first row, the last row, and the row with class load-next-set. The first row was introduced to push the rendered rows down to where they would be if all the rows were loaded. Both first-row and last-row are empty rows with no cells and therefore should not affect any operations that a user might perform, although this would require deeper testing to confirm.

So now we wait for a quarter of a second (250 ms) to check whether the user has stopped scrolling.

$(table.parentNode.parentNode).bind('scroll', function(evt) {
    var scrollTop = $(this).scrollTop();

    var element = document.querySelector('.load-next-set');
    var position = element.getBoundingClientRect();
    var element2 = document.querySelector('.last-row');
    var position2 = element2.getBoundingClientRect();
    var element3 = document.querySelector('.first-row');
    var position3 = element3.getBoundingClientRect();
    if((position.top >= 0 && position.bottom <= window.innerHeight) || 
      (position2.top < window.innerHeight && position2.top > 0  && position2.bottom + 50 >= window.innerHeight)) {
      console.log('Loading next set');
      self._onBottomTable(table, this, evt);
    }

    clearTimeout($.data(this, 'scrollTimer'));
    $.data(this, 'scrollTimer', setTimeout(function() {
      if(position2.top <= 0 && position2.bottom >= 0) {
        console.log('Last row is partially visible in screen');
        var goto = self.getPageNumberSrcolling(scrollTop);
        self._onChangeGotoScrolling(scrollTop, goto, table);
      }

      if(position3.top <= 150 && position3.bottom >= 0) {
        console.log('First row is partially visible in screen');
        var goto = self.getPageNumberSrcolling(scrollTop);
        self._onChangeGotoScrolling(scrollTop, goto, table);
      }
    }, 250));
});
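The visibility checks in the handler above can be read as two small predicates on a bounding rectangle. The sketch below is only an illustration: the function names are hypothetical, and `rect` is assumed to be any object with `top` and `bottom` in viewport coordinates, such as the result of getBoundingClientRect().

```javascript
// Hypothetical helpers, not part of OpenRefine.

// True when the element lies entirely inside the viewport
// (the check used for the .load-next-set row).
function isFullyVisible(rect, viewportHeight) {
  return rect.top >= 0 && rect.bottom <= viewportHeight;
}

// True when the element overlaps the band between the top edge of the
// viewport (y = 0) and y = bandBottom.
function overlapsTopBand(rect, bandBottom) {
  return rect.top <= bandBottom && rect.bottom >= 0;
}

// The last-row check in the timer corresponds to a band of 0 px,
// the first-row check to a band of 150 px.
var lastRowRect = { top: -3000, bottom: 200 };
var firstRowRect = { top: 100, bottom: 4000 };
console.log(overlapsTopBand(lastRowRect, 0));    // true
console.log(overlapsTopBand(firstRowRect, 150)); // true
```

Seen this way, the two timer branches differ only in which element and which band they test.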

When the last row or first row comes into view, we first check which page number corresponds to that scroll position and then call the _onChangeGotoScrolling() method.

DataTableView.prototype.getPageNumberSrcolling = function(scrollPosition) {
  var num = Math.floor(scrollPosition / this._sizeSinglePage);
  return num;
}

The page number is calculated by dividing the scroll position by the predicted size of a single page and rounding down. Since the two branches in the scroll handler, one for the last row and one for the first row, are identical except for the console output, I will merge them later.
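As a quick worked example of this calculation, here is a stand-alone version of the same formula. The row height of 25 px is an assumption made up for the example; only the page size of 100 comes from the setup described earlier.

```javascript
// Stand-alone copy of the formula used in getPageNumberSrcolling.
function pageNumberForScroll(scrollPosition, sizeSinglePage) {
  return Math.floor(scrollPosition / sizeSinglePage);
}

var assumedRowHeight = 25; // px per row, an assumption for this example
var pageSize = 100;        // rows per page
var sizeSinglePage = assumedRowHeight * pageSize; // 2500 px

console.log(pageNumberForScroll(0, sizeSinglePage));     // 0
console.log(pageNumberForScroll(2499, sizeSinglePage));  // 0
console.log(pageNumberForScroll(61000, sizeSinglePage)); // 24
```

Note that the resulting page numbers are zero-based: any position inside the first 2500 px maps to page 0.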

DataTableView.prototype._onChangeGotoScrolling = function(scrollPosition, gotoPageNumber, table, elmt, evt) {
  this._currentPageNumber = gotoPageNumber;
  var modifiedScrollPosition = this._sizeRowFirst * gotoPageNumber * this._pageSize;
  this._showRowsBottomSpeed(modifiedScrollPosition, scrollPosition, table, (gotoPageNumber - 1) * this._pageSize);
};

This function first calculates modifiedScrollPosition, which is the height that needs to be given to the first row of the loaded set. It is modified in the sense that it is not the raw scroll position: it takes the height of a row as derived from the predicted size of a single page, and then works out how much space the given number of pages would have required.
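The arithmetic can be tried in isolation. In the sketch below, `planJump` is a hypothetical name, `rowHeight` stands in for `_sizeRowFirst` (assumed here to be the estimated height of one row), and the numbers are invented. One more assumption: the start index is clamped at zero, because the original expression `(gotoPageNumber - 1) * pageSize` gives a negative value for page 0.

```javascript
// Hypothetical stand-alone version of the two values computed in
// _onChangeGotoScrolling.
function planJump(gotoPageNumber, pageSize, rowHeight) {
  return {
    // Height for the empty first row, standing in for the rows above the loaded set.
    topSpacerHeight: rowHeight * gotoPageNumber * pageSize,
    // Index of the first row to fetch; the clamp at 0 is an assumption.
    start: Math.max(0, (gotoPageNumber - 1) * pageSize)
  };
}

var plan = planJump(24, 100, 25);
console.log(plan.topSpacerHeight); // 60000
console.log(plan.start);           // 2300
```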

DataTableView.prototype._showRowsBottomSpeed = function(modifiedScrollPosition, scrollPosition, table, start, onDone) {
  var self = this;

  this._totalSize = start +  this._pageSize;
  $('tr.load-next-set').removeClass('load-next-set');

  Refine.fetchRows(start, this._pageSize, function() {
    $('.last-row').remove();

    loadRows();
    self._adjustNextSetClassesSpeed(modifiedScrollPosition);
    setScroll(scrollPosition);

    if (onDone) {
      onDone();
    }
  }, this._sorting);
};

showRowsBottomSpeed is pretty similar to showRowsBottom, with two additions: a new version of adjustNextSetClasses, and the scroll position, which needs to be set back to where it was before the rows were loaded (through setScroll()).

DataTableView.prototype._adjustNextSetClassesSpeed = function(modifiedScrollPosition) {
  var heightToAddTop = modifiedScrollPosition;
  var heightToAddBottom = Math.max(0, this._sizeRowsTotal - (modifiedScrollPosition + this._sizeSinglePage));

  $('tbody tr').slice(1, $('tbody tr').length - this._pageSize).remove();

  $('tbody tr:first').css('height', heightToAddTop);

  document.querySelector('.data-table').insertRow();
  $('tr:last').css('height', heightToAddBottom);
  $('tr:last').addClass('last-row');

  if (theProject.rowModel.mode == "record-based") {
    $('tr.record').eq(-51).addClass('load-next-set');
  } else {
    $('tr').eq(-52).addClass('load-next-set');
  }
};

The _adjustNextSetClassesSpeed function calculates the heights to add to the top and bottom of the loaded set. In addition, it deletes the previous set of rows (for virtual scrolling).
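Here is the height calculation on its own, with invented numbers: a table of 10,000 rows at an assumed 25 px each, so 250,000 px in total. `spacerHeights` is a hypothetical name; the formula mirrors the first two lines of _adjustNextSetClassesSpeed.

```javascript
// Hypothetical stand-alone version of the spacer height calculation.
function spacerHeights(topOffset, sizeRowsTotal, sizeSinglePage) {
  return {
    top: topOffset,
    // Whatever remains below the loaded page, never negative.
    bottom: Math.max(0, sizeRowsTotal - (topOffset + sizeSinglePage))
  };
}

var heights = spacerHeights(60000, 250000, 2500);
console.log(heights.top);    // 60000
console.log(heights.bottom); // 187500

// Top spacer + loaded page + bottom spacer add up to the full table height.
console.log(heights.top + 2500 + heights.bottom); // 250000
```

Because the three parts always add up to the predicted total height, the scrollbar keeps the same length no matter which page is currently loaded.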

var setScroll = function(scrollPosition) {
    console.log("setScroll");
    $('.data-table-container').scrollTop(scrollPosition);
};

setScroll sets the scroll position back to where it was before the rows were loaded.

Conclusion

Fast scrolling now works in both the upward and downward directions. The next step is to work on the upward direction for normal scrolling, which will also require work on the virtual scroll. The next post goes into detail on some of my ideas regarding this.