This is the mail archive of the gsl-discuss@sourceware.org mailing list for the GSL project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Sparse matrix extension


I corrected _fscanf. There was an error when reading the comment header.

On Wed, Feb 10, 2016 at 4:55 PM, Patrick Alken <alken@colorado.edu> wrote:
> I'm in favor of simplicity and easy-parsing, so matrix market sounds good to
> me. I'll take a look at your latest code in the next few days.
>
> Patrick
>
>
> On 02/10/2016 06:16 AM, Alexis Tantet wrote:
>>
>> Hi Patrick,
>>
>> Regarding the file format for sparse matrices, the one I have coded
>> actually happen to be the coordinate format implemented by Matrix
>> Market (the platform to share test data such as sparse matrices), with
>> the addition of a matrix type header:
>> http://math.nist.gov/MatrixMarket/formats.html
>>
>> It is also written that "Harwell-Boeing" is the most commonly used
>> sparse matrix format, but that:
>> "Unfortunately the HB specification is somewhat complex difficult to
>> parse from languages other than Fortran, biased in favor of compressed
>> column representation and not easily extensible. Some of these factors
>> may have impeded the more widespread use of the HB format"
>>
>> It seems to me that complying to the Matrix Market coordinate format
>> would be the right choice, in terms of ease of implementation,
>> compliance to other packages and user-friendliness. I could update the
>> print/scan functions accordingly (mostly handling the header). What do
>> you think?
>>
>> Best,
>> Alexis
>>
>>
>>
>>
>> On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet <alexis.tantet@gmail.com>
>> wrote:
>>>
>>> Ok, my mistake, now I see where I got confused.
>>> I had in mind to add all the elements first to the triplets and only
>>> while converting to compressed sum up the duplicates.
>>> While, indeed, if there's a way you can sum up the duplicates directly
>>> while adding them to the triplet matrix (thanks to _ptr), this is more
>>> handy and efficient.
>>>
>>> Thanks for the clarification,
>>> Alexis
>>>
>>> On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken <alken@colorado.edu>
>>> wrote:
>>>>
>>>> By design, gsl_spmatrix_set won't allow you to do this.
>>>>
>>>> If you add element (i, j, x) and then later to try add element (i, j,
>>>> y), gsl_spmatrix_set will detect that there exists an element in the (i,
>>>> j) spot and it will simply change x to y - the value of x will be
>>>> overwritten by y. This is the same behavior as gsl_matrix_set.
>>>>
>>>> So no duplicates are allowed by design. If you have such an application
>>>> where you want to keep track of duplicates, you could do the following:
>>>>
>>>> double *ptr = gsl_spmatrix_ptr(m, i, j);
>>>> if (ptr)
>>>>    *ptr += x; /* sum duplicate values */
>>>> else
>>>>     gsl_spmatrix_set(m, i, j, x);   /* initalize to x */
>>>>
>>>> On 02/07/2016 01:31 PM, Alexis Tantet wrote:
>>>>>
>>>>> I'm not sure I got your last point. I have the following situation in
>>>>> mind:
>>>>>
>>>>> Start to construct a transition matrix in triplet format, adding one
>>>>> element after another.
>>>>> In this particular example, each element is one count of a transition
>>>>> from (state, box, etc.) i to j,
>>>>> so I add elements  (i, j, 1) to the triplet object, with possibly
>>>>> duplicates.
>>>>> What happen to these duplicates in the binary tree?
>>>>>
>>>>> Eventually, when I compress to CRS or CCS, I would like the duplicates
>>>>> to be summed up, so that element (i, j) counts transitions from i to j
>>>>> (and no duplicates exist after compression).
>>>>>
>>>>> Is this more clear?
>>>>>
>>>>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu>
>>>>> wrote:
>>>>>>
>>>>>> Hi Alexis,
>>>>>>
>>>>>>>> I'm not sure what you mean. I've added a new function
>>>>>>>> gsl_spmatrix_ptr
>>>>>>>> to the git, which as far as I can tell does exactly what your
>>>>>>>> sum_duplicate flag does. It searches the matrix for an (i,j)
>>>>>>>> element,
>>>>>>>> and if found returns a pointer. If not found a null pointer is
>>>>>>>> returned.
>>>>>>>> This makes it easy for the user to modify A(i,j) after it has been
>>>>>>>> added
>>>>>>>> to the matrix. Are you thinking of something else? Can you point me
>>>>>>>> to
>>>>>>>> the Eigen routine?
>>>>>>>>
>>>>>>> What I meant is to have the equivalent of gsl_spmatrix_compress,
>>>>>>> with the difference that gsl_spmatrix_ptr is used instead of
>>>>>>> gsl_spmatrix_set,
>>>>>>> so has to build the compressed matrix from triplets, summing the
>>>>>>> duplicates, instead of replacing them.
>>>>>>> This is what is done here :
>>>>>>> The
>>>>>>> http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b
>>>>>>>
>>>>>>> Best,
>>>>>>> Alexis
>>>>>>
>>>>>> I'm not sure why a user would ever need to do this. The whole point of
>>>>>> the binary tree structure in the triplet storage is to efficiently
>>>>>> find
>>>>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on
>>>>>> an element which is already been previously set, it can find that
>>>>>> element with a binary search (rather than linearly searching the
>>>>>> arrays)
>>>>>> and change the value of that element.
>>>>>>
>>>>>> Therefore, the way the triplet storage is designed, there is will
>>>>>> never
>>>>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n])
>>>>>> will be unique for each n <= nz.
>>>>>>
>>>>>> Am I missing something?
>>>>>>
>>>>>> Patrick
>>>>>
>>>>>
>>>
>>>
>>> --
>>> Alexis Tantet
>>
>>
>>
>



-- 
Alexis Tantet


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]